Reading the Master: Newton
and the Birth of Celestial Mechanics 


Bruce Pourciau 


Dedicated to J. Bruce Brackenridge 


One factor that has remained constant through all 
the twists and turns of the history of physical 
science is the decisive importance of the 
mathematical imagination. 
—Freeman J. Dyson 


1. In January of 1684, the young astronomer Edmund Halley travelled from 
Islington up to London for a meeting of the Royal Society. Later, perhaps over tea 
and chocolate at a nearby coffee house, he chatted casually about natural philoso- 
phy and other topics with Sir Christopher Wren and Robert Hooke. Talk soon 
turned to celestial motions, and Halley later reconstructed the conversation [22, 
p. 26]: 


I, having from the consideration of the sesquialter proportion of Kepler 
concluded that the centripetall force [to the Sun] decreased in the proportion 
of the squares of the distances reciprocally, came one Wednesday to town, 
where I met with Sr Christ. Wren and Mr Hook, and falling in discourse
about it, Mr Hook affirmed that upon that principle all the Laws of the
celestiall motions were to be demonstrated, and that he himself had done it. I
declared the ill success of my attempts; and Sr Christopher to encourage the
Inquiry said that he would give Mr Hook or me 2 months time to bring him a
convincing demonstration thereof, and besides the honour, he of us that did
it, should have from him a present of a book of 40 shillings. Mr Hook then
said that he would conceale [his] for some time that other triing and failing,
might know how to value it, when he should make it publick. ... I remember
Sr Christopher was little satisfied that he could do it, and though Mr Hook
then promised to show it him, I do not yet find that in that particular he has
been as good as his word. 


The two month deadline passed. Wren and Halley waited through the summer, but 
still the promised proof from Hooke never came. Finally, in August, Halley would 
wait on Hooke no longer. He carried the question to Cambridge and the Lucasian 
Professor of Mathematics, Isaac Newton. 

Newton’s secretary and attendant has painted a portrait, daubed with colorful 
and concrete detail, of the eccentric Cambridge professor Halley had finally 
decided to approach [12, p. xiii-xiv]:


I cannot say, I ever saw him laugh, but once... I never knew him take any
Recreation or Pastime, either in Riding out to take ye Air, Walking, Bowling
or any other Exercise whatever, thinking all Hours lost, yt was not spent in
his Studyes, to wch he kept so close... so intent, so serious upon [them], yt
he eat very sparingly, nay, oft times he has forgot to eat at all, so yt going into
his Chamber I have found his Mess untouch’d, of wch when I have reminded
him, [he] would reply, Have I; & then making to ye Table, would eat a bit or
two standing, for I cannot say, I ever saw Him sit at Table by himself... He
very rarely went to Dine in ye Hall unless upon some Publick Dayes, & then,
if He has not been minded, would go very carelessly, wth Shooes down at
Heels, Stockins unty’d, Surplice on, & his Head scarcely comb’d... . At some
seldom Times when he design’d to dine in ye Hall [he] would turn to ye left
hand, & go out into ye street, where making a Stop, when he found his
mistake, [he] would hastily turn back & then sometimes instead of going into
ye Hall, would return to his Chamber again... .


...in his Garden, wch was never out of Order, ...he would, at some seldom
Times, take a short Walk or two, not enduring to see a Weed in it... . When
he has some Times taken a turn or two [he] has made a sudden Stand, turn’d
himself about, run up ye Stairs [&] like another A[r]chimides, with an
εὕρηκα fall to write on his Desk standing, without giving himself the
Leasure to draw a Chair to sit down on... .


In a letter from 1727 [22, p. 27], Abraham de Moivre set the scene as Halley, 
having arrived in Cambridge, posed the crucial question to the reclusive mathe- 
matician: 


... after they had been some time together, the Dr asked [Newton] what he
thought the Curve would be that would be described by the Planets supposing
the force of attraction towards the Sun to be reciprocal to the square of
their distance from it. Sr Isaac replied immediately that it would be an
Ellipsis. The Doctor struck with joy and amazement asked him how he knew 
it. Why saith he I have calculated it... . 


Witness the birth of celestial mechanics: the embryonic question has been an- 
swered— 


every orbital motion subject to an inverse-square force lies on a conic having focus 
at the force center 


—not with a guess, but with a mathematical demonstration! 

Semester after semester, at every college and university, we give our students 
the same answer Newton gave to Halley, our demonstrations—so different from 
Newton’s—blessed by the glories of vector calculus, and in this way we honor 
Newton and celebrate the emergence of celestial dynamics. In the present article, 
we honor Newton in the way of Abel, who counsels us to read the masters. We 
shall place the original argument from Newton’s Principia next to a modern 
counterpart, delighting in the stark contrasts. One delightful difference: Newton’s 
argument requires that we first answer the converse to Halley’s question— 


What force law maintains a conic motion orbiting about the focus? 


—and again, reading the master, we shall juxtapose the Principia’s very geometric 
proof of this reversal with its demonstration by vector calculus. In this mix of old
and new, of geometry and analysis, some insights and surprises make their way to
the surface: 


• The mathematics of the Principia is geometric analysis, both analysis in the
sense of ‘taking apart’ as well as analysis in the sense of calculus. Newton’s
geometry is calculus (limits, derivatives, integrals, acceleration, curvature)
masked as geometry.

• While less precise than their vector calculus descendants, the Principia’s
definitions have a concrete, visceral character that informs our geometric and
physical intuition.

• The first ten sections of the Principia (apart from the statement of the Third
Law) contain no physics, only mathematics. Newton may write of ‘forces,’ but
he calculates accelerations. His concentration on acceleration and shape
reminds us that force and mass take no part in the mathematics of the
one-body problem, which occupies the leading sections of the Principia.

• In contrast to force, curvature is deeply involved with the Principia’s orbital
dynamics, yet apart from rare oblique sightings, the dependence on curvature
remains hidden.

• Asked who should receive credit for answering Halley’s question with a
demonstration rather than a guess, historians of science bow to Newton.
Asked for evidence to back up their claim, the historians open the Principia
and point to a two-sentence argument. We confirm that Newton’s little sketch,
given air and sun, blossoms into a cogent proof.

• Reading the masters (Archimedes, Newton, Euler, Gauss, Riemann, ...)
can mean entering a foreign paradigm, an unfamiliar mathematical world
where alien values, language, definitions, tools, strategies, and assumptions
frustrate our attempts to understand. And so it is with the Principia. But with
persistence and prayer, even the Principia sends up her secrets. As we slowly
learn to navigate in Newton’s world, we deepen our understanding of the
Principia’s paradigm as well as our own.


It may seem odd to have placed our conclusions here in the introduction, but with 
these closing remarks now out of the way, we can read on unburdened by the 
western need to fret and fuss about the point of it all. As the Taoist philosopher 
Chuang Tzu suggests [19, p. 126], we can now lean back and float with the current, 
“going under with the swirls and coming out with the eddies, following along the 
way the water goes, and never thinking... .” 


2. We begin with Newton’s generalized answer to Halley—that every orbit pro- 
duced by an inverse-square force must lie on a conic—in this section giving a 
contemporary proof and in the next exploring the Principia’s original argument. 
But we should first agree on some technical vocabulary, so that we can be more 
precise. Any smooth map $\mathbf r = \mathbf r(t)$ from an open interval J into euclidean 3-space
is a motion. Every motion $\mathbf r$ has a velocity $\mathbf v = \dot{\mathbf r}$ and an acceleration $\mathbf a = \dot{\mathbf v}$. For the
magnitude of a vector, we choose the same letter in nonbold italic: thus, for
example, $r = |\mathbf r|$, $v = |\mathbf v|$, and $a = |\mathbf a|$. (We tacitly assume that $r$ and $v$ (the speed)
never vanish.) We say the motion $\mathbf r$ has an inverse-square acceleration provided for
some nonzero $\lambda$,

$$\mathbf a = -\frac{\lambda}{r^2}\,\mathbf U$$

for all $t$ in J. Here $\mathbf U$ stands for the unit direction vector $\mathbf r/r$. More generally,
whenever the cross-product $\mathbf r \times \mathbf a$ vanishes identically, we call $\mathbf r$ an orbital motion.
If the origin S has some significance (it might be the focus of a conic or the pole
of a spiral, for instance), an orbital motion may be labelled a motion about S. A
sentence that would be typical of the Principia, “A body is urged by a centripetal 
force continually directed toward an immovable center S,” becomes briefer in our 
language: “Given a motion about S.” 

Assuming that Mars traversed an ellipse with its position vector sweeping out 
equal areas in equal times, Kepler made predictions in his Astronomica nova of 
1609 that matched the careful observations of Tycho Brahe. In Propositions I and 
II (Section II, Book I) of the Principia, Newton uses this area principle to 
characterize orbital motions in general [11, p. 40 and 42]: 


PROPOSITION I THEOREM I 
The areas which revolving bodies describe by radii drawn to an immovable centre 
of force do lie in the same immovable planes, and are proportional to the times in 
which they are described. 


PROPOSITION II THEOREM II 
Every body that moves in any curved line described in a plane, and by a radius 
drawn to a point either immovable, or moving forwards with an uniform
rectilinear motion, describes about that point areas proportional to the times, is 
urged by a centripetal force directed to that point. 


Today of course we translate these propositions into the language of vectors: 


NEWTON’S AREA THEOREM For any motion r = r(t), the following are equiva- 
lent: 

(a) r is orbital 

(b) the (massless) angular momentum h = r × v is constant

(c) r is planar and sweeps out area at a constant rate 


The proof is simple, especially once we agree that the area swept out is

$$\frac{1}{2}\int_{t_0}^{t} |\mathbf r \times \mathbf v|\,dt,$$

the only slippery step being to show $\mathbf r$ is planar when $\mathbf h$ vanishes everywhere, but in
this case the derivative $\dot{\mathbf U}$ vanishes everywhere (recall $\mathbf U = \mathbf r/r$), indicating that the
motion lies on a fixed ray from the origin. That $\dot{\mathbf U}$ remains zero follows from a
simple fact:

$$\dot{\mathbf U} = \frac{\mathbf h \times \mathbf r}{r^3}. \tag{1}$$

Halley’s question and Newton’s answer involve the relationship between the 
acceleration of the motion and the shape of the orbit. Moving from acceleration to 
shape, we define the trajectory of a motion $\mathbf r = \mathbf r(t)$ to mean the subset $\{\mathbf r(t) : t \in J\}$
of 3-space. An orbit is then just the trajectory of an orbital motion. If a trajectory
lies on a conic, say, or a spiral, we would have a conic or spiral motion. The
Principian sentence, “A body, urged by a centripetal force continually directed
toward an immovable center S, moves in a conic section with focus at S,” now turns
into “Consider a conic motion about S.” Of course conics hold some special
interest for us here, and we recall the following definition: a conic is the locus of
points whose distance from a given point S (the focus) is some positive constant $e$
(the eccentricity) times the distance from a given line (the directrix). Perhaps we
should put this definition in vector dress, so it will feel more comfortable when
vector calculus comes to call. If we let $\mathbf r$ be the position vector from the focus, $d$
the distance from the directrix to the focus, and $\mathbf e$ (the eccentricity vector) a vector
of length $e$ which points perpendicularly toward the directrix, then the definition
tells us that

$$r = e\left(d - \mathbf r\cdot\frac{\mathbf e}{e}\right),$$

and with the notation $\mathbf U = \mathbf r/r$ and $l = de$, this formula turns into the vector conic
equation:

$$\mathbf r\cdot(\mathbf e + \mathbf U) = l. \tag{2}$$

The constant $l$ is called the semi-latus rectum of the conic. Given a positive
constant $l$ and a nonzero vector $\mathbf e$, the vector conic equation defines a conic with
semi-latus rectum $l$, eccentricity $e = |\mathbf e|$, axis along $\mathbf e$, and focus at the origin. When
$\mathbf e = \mathbf 0$, then (2) describes a circle of radius $l$ about the origin, and if $l = 0$, we have
a ray from the origin.
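To see the vector conic equation (2) at work, here is a small numerical sketch in Python. It assumes the eccentricity vector lies along the positive x-axis, so that (2) becomes the familiar polar form r(1 + e cos θ) = l, and it checks that points produced this way satisfy the focus-directrix definition recalled above. The function names and the sample values l = 2, e = 0.6 are hypothetical choices made for illustration, not anything taken from the Principia.

```python
import math

def conic_point(l, e, theta):
    """Point on the conic r.(e + U) = l, assuming the eccentricity vector is (e, 0).

    With U = (cos theta, sin theta), equation (2) reads r * (1 + e*cos(theta)) = l.
    """
    r = l / (1.0 + e * math.cos(theta))
    return (r * math.cos(theta), r * math.sin(theta))

def focus_directrix_gap(l, e, theta):
    """|distance to focus - e * distance to directrix| at the point with angle theta."""
    x, y = conic_point(l, e, theta)
    d = l / e                          # directrix is the line x = d, because l = d*e
    return abs(math.hypot(x, y) - e * abs(d - x))

# Sample an ellipse (e < 1) all the way around the focus.
worst = max(focus_directrix_gap(2.0, 0.6, 0.1 * k) for k in range(63))
print(worst)   # expected to be at round-off level
```

The sketch stays with e < 1 for the full sweep of angles; for e ≥ 1 the denominator 1 + e cos θ vanishes at some angles, so the range of θ would have to be restricted.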

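At this point, we have the vocabulary and background to explore a contemporary
version of Newton’s answer to Halley. Suppose we have a motion $\mathbf r = \mathbf r(t)$ with
an inverse-square acceleration, so that for some nonzero number $\lambda$,

$$\mathbf a(t) = -\frac{\lambda}{r^2}\,\mathbf U(t)$$

for all $t$ in some open interval J. Crossing with the angular momentum $\mathbf h = \mathbf r \times \mathbf v$,
we have

$$\mathbf a \times \mathbf h = -\frac{\lambda}{r^2}\,\mathbf U \times \mathbf h = -\lambda\,\frac{\mathbf r \times \mathbf h}{r^3},$$

which becomes, using (1),

$$\mathbf a \times \mathbf h = \lambda\,\dot{\mathbf U}.$$

Now antidifferentiate, remembering that $\mathbf h$ is constant because $\mathbf r$ is orbital:

$$\mathbf v \times \mathbf h = \lambda\,\mathbf U + \mathbf c = \lambda(\mathbf U + \mathbf e)$$

for some constant vectors $\mathbf c$ and $\mathbf e = \frac{1}{\lambda}\mathbf c$. If we dot with $\mathbf r$, we find

$$\frac{1}{\lambda}\,\mathbf r\cdot(\mathbf v \times \mathbf h) = \mathbf r\cdot(\mathbf e + \mathbf U),$$

and then permuting the entries in the scalar triple product uncovers the vector
conic equation (2):

$$\frac{h^2}{\lambda} = \mathbf r\cdot(\mathbf e + \mathbf U).$$

When the constant vector $\mathbf h$ vanishes, the antidifferentiated equation reduces to $\mathbf U = -\mathbf e$, and the motion
must then lie on a fixed ray from the origin. If $\mathbf h$ does not vanish, but $\mathbf e$ does, we
conclude $r = h^2/\lambda$, so the orbit lies on a circle centered at the origin. Supposing
neither $\mathbf h$ nor $\mathbf e$ vanishes, we have seen that the vector conic equation (2) defines a
conic with focus at the origin. And that seals it: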


NEWTON’S SHAPE THEOREM. Apart from motion on a ray from the center, every 
motion with an inverse-square acceleration must be a conic motion about the focus. 
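The first proof can also be probed numerically. The Python sketch below integrates a planar inverse-square motion with a classical fourth-order Runge-Kutta step, reads off the constant vector e = (1/λ) v × h − U from the initial data, and then monitors the residual r·(e + U) − h²/λ along the computed orbit. Everything here is an illustrative assumption: the value λ = 1, the initial state, the step size, and the choice of integrator are not drawn from the article.

```python
import math

LAM = 1.0  # hypothetical value of lambda in a = -(lambda / r^2) U

def accel(x, y):
    """Inverse-square acceleration directed at the origin."""
    r = math.hypot(x, y)
    return (-LAM * x / r**3, -LAM * y / r**3)

def rk4_step(state, dt):
    """One classical Runge-Kutta step for state = (x, y, vx, vy)."""
    def deriv(s):
        ax, ay = accel(s[0], s[1])
        return (s[2], s[3], ax, ay)

    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def conic_data(state):
    """Return (h, ex, ey): planar angular momentum and e = (1/lambda) v x h - U."""
    x, y, vx, vy = state
    r = math.hypot(x, y)
    h = x * vy - y * vx                # z-component of r x v
    ex = vy * h / LAM - x / r          # v x (h z-hat) = (vy*h, -vx*h)
    ey = -vx * h / LAM - y / r
    return h, ex, ey

def conic_residual(state, h, ex, ey):
    """r.(e + U) - h^2/lambda, which the first proof says should vanish."""
    x, y = state[0], state[1]
    return x * ex + y * ey + math.hypot(x, y) - h * h / LAM

state = (1.0, 0.0, 0.0, 1.2)           # assumed initial position and velocity
h0, ex0, ey0 = conic_data(state)
worst = 0.0
for _ in range(5000):
    state = rk4_step(state, 0.002)
    worst = max(worst, abs(conic_residual(state, h0, ex0, ey0)))
print(h0, ex0, ey0, worst)
```

A residual that stays small is of course no proof, only a sanity check that the computed orbit hugs the conic the theorem predicts for these initial data.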


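A second proof of the Shape Theorem is quick but sly. Assume again that

$$\mathbf a(t) = -\frac{\lambda}{r^2}\,\mathbf U(t).$$

Then of course $\mathbf h$ remains constant, but (surprise!) so does the vector
$\mathbf L = \frac{1}{\lambda}\mathbf v \times \mathbf h - \mathbf U$. To check, compute the derivative:

$$\dot{\mathbf L} = \frac{1}{\lambda}\,\mathbf a \times \mathbf h - \frac{\mathbf h \times \mathbf r}{r^3} = \frac{1}{\lambda}\left(-\frac{\lambda}{r^2}\,\mathbf U\right) \times \mathbf h - \frac{\mathbf h \times \mathbf U}{r^2} = \mathbf 0.$$

Now just dot $\mathbf r$ with $\mathbf L + \mathbf U$,

$$\mathbf r\cdot(\mathbf L + \mathbf U) = \frac{1}{\lambda}\,\mathbf r\cdot(\mathbf v \times \mathbf h) = \frac{h^2}{\lambda},$$

and we recognize the vector conic equation (2). That’s all there is to it.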
The sly part of this proof is (un)clear: why would one expect the vector
$\frac{1}{\lambda}\mathbf v \times \mathbf h - \mathbf U$ to be constant? The secret lies in a formula for the eccentricity vector
$\mathbf e$. Given any conic motion $\mathbf r = \mathbf r(t)$, if we differentiate the vector conic equation,

$$\mathbf r\cdot(\mathbf e + \mathbf U) = l,$$

and solve for the (constant) eccentricity vector $\mathbf e$, we obtain the

ECCENTRICITY FORMULA. For any motion $\mathbf r = \mathbf r(t)$ satisfying the vector conic
equation (2),

$$\mathbf e = \frac{l}{h^2}\,\mathbf v \times \mathbf h - \mathbf U. \tag{3}$$

Of course we began with an inverse-square motion, not a conic motion, but if we
had had a conic motion, then the vector $(l/h^2)\,\mathbf v \times \mathbf h - \mathbf U$, representing as it does
the eccentricity vector, would have been a priori constant. Knowing that $\lambda$ turns
out to be $h^2/l$ (see our first proof), it seems natural then to suspect that
$\mathbf L = (1/\lambda)\,\mathbf v \times \mathbf h - \mathbf U$ should be constant in the case of inverse-square acceleration.
If you do not like this sneaky proof of the Shape Theorem, blame Laplace. The
vector $\mathbf L$, sometimes called the Laplace-Runge-Lenz vector, has the history of its
rediscoveries etched in its name.
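The step "differentiate and solve for e" is compressed, so here is one way it might be unpacked. This is a sketch under two assumptions that we state openly: the motion stays in the plane of the conic (so e, U, and v all lie in that plane), and h = r × v does not vanish. The scalar c is a placeholder introduced only for this computation.

```latex
\begin{aligned}
&\text{Differentiate } \mathbf r\cdot(\mathbf e+\mathbf U)=l:\qquad
 \mathbf v\cdot(\mathbf e+\mathbf U)+\mathbf r\cdot\dot{\mathbf U}=0,
 \qquad \mathbf r\cdot\dot{\mathbf U}
   =\frac{\mathbf r\cdot(\mathbf h\times\mathbf r)}{r^{3}}=0 \ \text{ by (1)},\\[4pt]
&\text{so } (\mathbf e+\mathbf U)\perp\mathbf v
 \ \text{ and } (\mathbf e+\mathbf U)\perp\mathbf h
 \quad\Longrightarrow\quad
 \mathbf e+\mathbf U = c\,(\mathbf v\times\mathbf h)\ \text{ for some scalar } c,\\[4pt]
&\text{and dotting with } \mathbf r:\qquad
 l=\mathbf r\cdot(\mathbf e+\mathbf U)
  = c\,\mathbf r\cdot(\mathbf v\times\mathbf h)
  = c\,\mathbf h\cdot(\mathbf r\times\mathbf v)= c\,h^{2}
 \quad\Longrightarrow\quad c=\frac{l}{h^{2}} .
\end{aligned}
```

Substituting c back gives e = (l/h²) v × h − U, which is formula (3).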
Now that we have seen two contemporary proofs, let us drift back in time, back
to the 1680s, to examine Newton’s original argument for the Shape Theorem in the 
Principia. 


3. Only with some nervousness do we open Newton’s monumental work
Philosophiae Naturalis Principia Mathematica. It had a reputation in 1687; it has a 
reputation still—a reputation for being impenetrable. In the latter half of the 
eighteenth century and on into the nineteenth, this reputation fed a cottage 
industry of writing notes and commentaries devoted entirely to ‘understanding’ the 
Principia. (The industry may have declined, but it still produces excellent commen- 
taries from time to time: witness [5] and [6], just out in 1995.) Always formal, terse, 
and crabbed in his scholarly work, Newton took these stylistic tendencies to their 
limit in the Principia. Why? A decade earlier, his theory of colors had been 
attacked by Leibniz, Hooke, Linus, Lucas, as well as others, and Newton had 
detested the controversy. In a shrill letter to Henry Oldenburg, who was then 
Secretary of the Royal Society, Newton despairs, “I see I have made myself a slave 
to Philosophy, but if I get free of Mr. Linus’s business I will resolutely bid adew to 
it eternally, excepting what I do for my private satisfaction or leave to come out 
after me. For I see a man must either resolve to put out nothing new or become a 
slave to defend it.” [7, p. 198] Of course, Newton did not “leave [the Principia] to 
come out after [him],” but he did choose to limit his readership and therefore his
potential critics by composing in an icy, mathematical style, ultimately producing 
500 pages of dense Latin text—definitions, axioms, lemmas, theorems, proposi- 
tions, demonstrations, scholia, and figures, all fixed in place, a massive ordered 
regiment of abstract formality. According to a close friend of Newton’s [2, p. 168],
controversy of any kind 


made sr Is[aac] very uneasy; who abhorred all Contests...And for this 
reason, mainly to avoid being baited by little Smatterers in Mathematicks, he 
told me, he designedly made his Principia abstruse; but yet so as to be 
understood by able Mathematicians, who he imagined, by comprehending his 
Demonstrations, would concurr with him in his Theory. 


Yet even the most able mathematicians of the day struggled with the Principia. 
The confident young mathematician Abraham de Moivre happened to be visiting 
the Duke of Devonshire when Newton arrived to present the Duke with a copy of 
the new work [21, p. 471]: 


[de Moivre] opened the book and deceived by its apparent simplicity per- 
suaded himself that he was going to understand it without difficulty. But he 
was surprised to find it beyond the range of his knowledge and to see himself 
obliged to admit that what he had taken for mathematics was merely the 
beginning of a long and difficult course that he had yet to undertake. He 
purchased the book, however; and since the lessons he had to give forced him 
to travel about continually, he tore out the pages in order to carry them in his 
pocket and to study them during his free time. 


Prepared by its scary reputation, we cannot conjure up the initial poise of de 
Moivre as we open the Principia, but prepared for some hard work, let us take a 
look at Newton’s argument for the Shape Theorem. Actually, to do this in the 
proper order, we should close the Principia for the moment and begin nearer the
beginning, returning to Halley’s call on Newton in 1684. Earlier we have read de
Moivre’s description of their meeting [22, p. 27]: 


...after they had been some time together, the Dr asked him what he
thought the Curve would be that would be described by the Planets supposing
the force of attraction towards the Sun to be reciprocal to the square of
their distance from it. Sr Isaac replied immediately that it would be an
Ellipsis. The Doctor struck with joy and amazement asked him how he knew 
it. Why saith he I have calculated it... . 


But stopping here is a rude interruption, for de Moivre continues [7, p. 283], 


... whereupon Dr Halley asked him for his calculation without any farther
delay, Sr Isaac looked among his papers but could not find it, but he
promised him to renew it, & sent it. 


It would be three months before Newton made good his promise, but idleness had 
not caused the delay, for he not only renewed his calculation for the ellipse, but 
embedded that calculation in a nine-page tract, “De motu Corporum in gyrum”
(“On the Motion of Bodies in Orbit”), which Halley received in November. 

It is in “De motu” then that we should look for Newton’s original demonstration 
of the Shape Theorem, that an inverse-square force implies conic orbits. Thumbing 
through its pages, we pass a line of definitions, hypotheses, theorems, corollaries, 
and problems until we stop at a familiar-looking claim [12, VI p. 49]: 


Scholium The major planets orbit, therefore, in ellipses having a focus at the 
centre of the Sun... exactly as Kepler supposed. 


The Shape Theorem (at least for ellipses)! Eagerly we anticipate the proof— 
hunched over the scholium, eyes narrowed, pencil poised—but then the adrenaline 
seeps away as we scan down the page to find...nothing. Newton has left the 
Shape Theorem, his answer to Halley, as a bald claim, completely unsupported! 
Because the scholium directly follows 


Problem 3 A body orbits in an ellipse: there is required the law of centripetal 
force tending to a focus of the ellipse. 


we would guess that Newton must have viewed the Shape Theorem as a trivial 
corollary of his solution to Problem 3, or, more generally, of what we shall call 


NEWTON’S ACCELERATION THEOREM. Every conic motion about the focus has 
an inverse-square acceleration. 


Not understanding how the Shape Theorem would follow trivially from the 
Acceleration Theorem, we turn from “De motu” to the Principia, expecting the 
fuller development there to enlighten us. 

Halley’s question in August of 1684 had reseeded Newton’s interest in celestial 
mechanics, and “De motu” was just the first little sprout. In January of 1685, he 
wrote Flamsteed, the Astronomer Royal, “Now that I am upon this subject, I 
would gladly know ye bottom of it before I publish my papers.” [7, p. 286] What 
understatement: between November of 1684 and April of 1687, Newton came to
“know ye bottom of it,” and the nine-page treatise exploded into a five-hundred-page
masterpiece.

Now remember that “De motu” had left the Shape Theorem unproved. And the 
1687 Principia? No better! In Section III of Book I, Newton demonstrates Propositions
XI-XIII, which, taken together, establish the Acceleration Theorem, and
then follows with the Shape Theorem dressed as a corollary [11, p. 61] to this trio
of propositions: 


Cor. I From the three last Propositions it follows, that if a body P goes from 
place P with any velocity in the direction of any right line PR, and at the same 
time is urged by the action of a centripetal force that is inversely proportional to 
the square of the distance of the places from the center, the body will move in one 
of the conic sections, having its focus in the center of force... . 


But again, no proof. Worse yet, no one complained—not Halley, not Leibniz, not 
Huygens, not de Moivre—until, in October of 1710, twenty-three years after the 
publication of the Principia, Johann Bernoulli finally pointed out the obvious: 
Corollary I needed a demonstration. By this time, however, perhaps getting an 
early wind of Bernoulli’s criticism, Newton had already decided to fill the gap, 
instructing his editor, in a letter dated 11 October 1709, to slip the following 
argument [13, p. 5-6] into the second edition (1713) of the Principia: 


Nam datis umbilico et puncto contactus & positione tangentis, describi potest 
Sectio conica quae curvaturam datam ad punctum illud habebit. Datur autem 
curvatura ex data vi centripeta: et Orbes duo se mutuo tangentes eadem vi describi 
non possunt. 


For the third edition (1726), Newton added to this shockingly brief sketch the word 
‘velocity’ in two places, resulting in [11, p. 61] 


NEWTON’S ARGUMENT FOR THE SHAPE THEOREM 
For the focus, the point of contact, and the position of the tangent, being given, a 


conic section may be described, which at that point shall have a given curvature. 
But the curvature is given from the centripetal force and velocity of the body being 
given; and two orbits, touching one the other, cannot be described by the same 
centripetal force and the same velocity. 


Brevity may be the soul of wit, but it may be the seed of confusion as well. No 
one laughs when a fundamental proposition of celestial mechanics is followed by a 
two-sentence sketch which fails to persuade. At least Newton’s plan, although
strikingly different from what we saw in Section 2, seems both familiar and
clear: to prove that every solution to a given initial-value problem has a particular
form, we exhibit a solution of that form and then invoke a uniqueness
principle. Connecting all the dots in the outline may be another story, however,
especially when some of the dots themselves are missing.

Expanding Newton’s sketch in a natural way, we arrive at what we take as his 
intended strategy: 


NEWTON’S STRATEGY FOR PROVING THE SHAPE THEOREM 


1. Suppose given any motion $\tilde{\mathbf r} = \tilde{\mathbf r}(t)$ with an inverse-square acceleration. At
some time $t_0$, note the position $\mathbf r_0$, velocity $\mathbf v_0$, and curvature $\kappa_0$ of the
motion $\tilde{\mathbf r}$.

2. Construct a conic $\mathcal C$, having focus at the origin, that passes through the tip
of $\mathbf r_0$ with tangent parallel to $\mathbf v_0$ and curvature $\kappa_0$.

3. On the conic $\mathcal C$, put a motion $\mathbf r = \mathbf r(t)$ about the focus that leaves the tip of
$\mathbf r_0$ with velocity $\mathbf v_0$. (Newton never mentions this step, which involves making
sure the position vector sweeps out area at a uniform rate, but it’s a simple
matter, and one that he probably took for granted.)

4. From Propositions XI-XIII (the Acceleration Theorem), infer that $\mathbf r = \mathbf r(t)$,
a conic motion about the focus, must have an inverse-square acceleration.

5. Thus both $\mathbf r$ and $\tilde{\mathbf r}$ have inverse-square accelerations, but even better, the
matching of position, velocity, and curvature in steps (2) and (3) forces $\mathbf r$ and
$\tilde{\mathbf r}$ to share the same proportionality constant.

6. Finally, noting that $\mathbf r$ and $\tilde{\mathbf r}$ now both solve the same initial-value problem,
invoke a uniqueness principle to conclude that $\tilde{\mathbf r} = \mathbf r$, proving that our given
inverse-square motion $\tilde{\mathbf r}$ must be a conic motion about the focus as desired.


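As we begin to check whether this six-step strategy unfolds further into a
convincing proof, we can see already that step (2) will block us, unless we know a
little about the curvature of conics. For a motion $\mathbf r = \mathbf r(t)$, the curvature $\kappa$ is $|\dot{\mathbf T}|/v$
and the radius of curvature $\rho$ is $1/\kappa$, where $\mathbf T$ is the unit tangent $\mathbf v/v$. From the
velocity and the acceleration, we can easily find the curvature from a well-known
formula:

$$\rho = \frac{v^3}{|\mathbf a \times \mathbf v|}. \tag{4}$$

To calculate the radius of curvature for a conic, we start with any motion
$\mathbf r = \mathbf r(t)$ satisfying the vector conic equation (2),

$$\mathbf r\cdot(\mathbf e + \mathbf U) = l,$$

differentiate twice to get

$$\mathbf a\cdot(\mathbf e + \mathbf U) + \mathbf v\cdot\frac{\mathbf h \times \mathbf r}{r^3} = 0,$$

and insert our formula (3) for the eccentricity vector $\mathbf e$ to see that

$$\frac{l}{h^2}\,\mathbf a\cdot(\mathbf v \times \mathbf h) + \mathbf v\cdot\frac{\mathbf h \times \mathbf r}{r^3} = 0.$$

Sliding the entries in the scalar triple products gives back

$$|\mathbf a \times \mathbf v| = \frac{1}{l}\left(\frac{h}{r}\right)^3,$$

which leads to

$$\rho = \frac{v^3}{|\mathbf a \times \mathbf v|} = l\left(\frac{rv}{h}\right)^3,$$

or, rephrasing with $h = rv\,|\mathbf U \times \mathbf T|$, to the

CONIC CURVATURE LEMMA. For any conic motion with semi-latus rectum $l$,

$$\rho = \frac{l}{|\mathbf U \times \mathbf T|^3}. \tag{5}$$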
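The Conic Curvature Lemma is easy to spot-check on a concrete ellipse. The Python sketch below places a focus at the origin, computes the radius of curvature once from formula (4) and once from ρ = l/|U × T|³, and returns both values for comparison. It relies on the fact that curvature depends only on the curve, so any regular parameterization may stand in for the motion; the eccentric-angle parameterization, the function name, and the sample semi-axes used afterwards are all assumptions made for illustration.

```python
import math

def curvature_check(a, b, u):
    """Radius of curvature on an ellipse with a focus at the origin, computed two ways.

    a, b are the semi-axes (a > b > 0) and u is the eccentric angle of the point.
    Returns (rho from formula (4), rho from the lemma l / |U x T|^3).
    """
    c = math.sqrt(a * a - b * b)           # centre-to-focus distance
    l = b * b / a                          # semi-latus rectum
    # Position from the focus, and first and second derivatives with respect to u.
    x, y = a * math.cos(u) - c, b * math.sin(u)
    vx, vy = -a * math.sin(u), b * math.cos(u)
    ax, ay = -a * math.cos(u), -b * math.sin(u)
    v = math.hypot(vx, vy)
    r = math.hypot(x, y)
    rho_formula4 = v**3 / abs(ax * vy - ay * vx)      # rho = v^3 / |a x v|
    u_cross_t = abs(x * vy - y * vx) / (r * v)        # |U x T|
    rho_lemma = l / u_cross_t**3
    return rho_formula4, rho_lemma

for angle in (0.0, 0.7, 2.0, 3.0, 5.0):
    print(angle, curvature_check(3.0, 2.0, angle))
```

At the vertex nearest the focus (u = 0) the tangent is perpendicular to the radius, so |U × T| = 1 and both expressions reduce to the semi-latus rectum b²/a.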


Newton cast this lemma more elegantly [12, III p. 159]: If the line perpendicular
to the conic at P meets the focal axis at N, then $\rho$ varies as $PN^3$. (The equivalence to
our lemma follows from a geometric fact about conics: the projection of PN onto
SP is the semi-latus rectum.) This lovely property is just one of several striking
results on curvature obtained by Newton in his 1671 tract on series and fluxions.
“The problem [of curvature],” he wrote in this tract, “has the mark of exceptional
elegance and of being pre-eminently useful in the science of curves.” [12, III p. 151]
From an insight in his Waste Book made around December of 1664 (over twenty
years before the Principia), we have evidence that Newton also recognized the
fundamental place of curvature in the study of orbital motions: “If the body b
moved in an Ellipsis, then its force in each point (if its motion in that point bee
given) may be found by a tangent circle of equall crookedness [read curvature] with
that point of the Ellipsis.” [22, p. 14] It is perhaps surprising then that curvature
plays no role in the 1687 Principia. However, in the 1690s Newton made radical
plans for revising the first edition, plans that would have made curvature the
centerpiece of his celestial mechanics. Sadly, this radical revision never made it
into print, and in the end Newton contented himself with relatively minor changes,
squeezing some curvature methods into the second (1713) and third (1726) editions
as tacked-on corollaries. For more on the role of curvature in Newton’s celestial
mechanics, see [3, 4, 10, and 17].

Now that we know something about the curvature of conics, we can begin to 
connect all the dots in a proof of the Shape Theorem inspired by Newton’s 
two-sentence argument in the Principia. We follow the six-step strategy above, for 
it seems to be the only plausible interpretation of what Newton had in mind. 


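Step 1: We give ourselves any motion $\tilde{\mathbf r} = \tilde{\mathbf r}(t)$ with an inverse-square acceleration:
for some nonzero $\lambda$, suppose $\tilde{\mathbf r}$ solves the initial-value problem

$$\ddot{\tilde{\mathbf r}}(t) = -\frac{\lambda}{\tilde r^2}\,\tilde{\mathbf U}(t), \qquad \tilde{\mathbf r}(t_0) = \mathbf r_0, \qquad \dot{\tilde{\mathbf r}}(t_0) = \mathbf v_0$$

on the open interval J. If $\mathbf r_0 \times \mathbf v_0 = \mathbf 0$, then the motion lies on a fixed ray through
the origin, but apart from this special case, we need to prove that $\tilde{\mathbf r}$ is a conic
motion about the focus. Since $\tilde{\mathbf r}$ is an orbital motion, the orbit lies in a fixed plane
and the angular momentum remains fixed at $\mathbf h_0 = \mathbf r_0 \times \mathbf v_0$.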


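Step 2: In this fixed plane, we now construct a conic that “fits” the orbit of $\tilde{\mathbf r}$. Let
$\rho_0$ be the radius of curvature of $\tilde{\mathbf r}$ at $\tilde{\mathbf r}(t_0) = \mathbf r_0$. Put

$$l = \rho_0\,|\mathbf U_0 \times \mathbf T_0|^3, \qquad \mathbf e = \frac{l}{h_0^2}\,\mathbf v_0 \times \mathbf h_0 - \mathbf U_0,$$

where $\mathbf U_0 = \mathbf r_0/r_0$, $\mathbf T_0 = \mathbf v_0/v_0$, and $\mathbf h_0 = \mathbf r_0 \times \mathbf v_0$. (As $\mathbf r_0$ and $\mathbf v_0$ are not parallel,
$\mathbf h_0 \neq \mathbf 0$ and $\mathbf e$ is well-defined.) The vector conic equation (2)

$$\mathbf r\cdot(\mathbf e + \mathbf U) = l$$

now defines a particular conic $\mathcal C$. One easily checks that $\mathcal C$ has a focus at the
origin, and that $\mathcal C$ passes through the tip of $\mathbf r_0$ with its tangent parallel to $\mathbf v_0$ and
its radius of curvature equal to $\rho_0$.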


Step 3: At this point, we would like to apply Newton’s Acceleration Theorem to our
constructed conic, but the Acceleration Theorem applies only to conic motions,
indeed only to conic motions about the focus, not to mere conic loci. Therefore, on
the conic locus $\mathcal C$ we now place a motion about the focus. (To put it differently, we
must parameterize the conic locus $\mathcal C$ in a way that keeps the acceleration vector
pointed at the focus.) By the Area Theorem, to make a motion about the focus, we
need only make a motion whose position vector from the focus sweeps out area at
a constant rate, and intuitively we can do this by arranging for the area swept out
to be our parameter. More precisely: Using arc-length measured from the tip of $\mathbf r_0$,
let $\mathbf r_* = \mathbf r_*(s)$ be the unit-speed motion on $\mathcal C$ having initial velocity $\mathbf T_0$. The real
function

$$\alpha(s) = t_0 + \int_0^s \frac{1}{h_0}\,|\mathbf r_*(\sigma) \times \mathbf r_*'(\sigma)|\,d\sigma$$

is smooth and strictly increasing. (Note that $h_0 = |\mathbf r_0 \times \mathbf v_0| \neq 0$ and $|\mathbf r_*(s) \times \mathbf r_*'(s)| \neq 0$
for all $s$, because tangents to $\mathcal C$ never pass through the focus.) Take the
(smooth) inverse $\alpha^{-1} = \alpha^{-1}(t)$, and use it to define a motion $\mathbf r = \mathbf r(t)$ on $\mathcal C$ by

$$\mathbf r(t) = \mathbf r_*[\alpha^{-1}(t)].$$

This constructed conic motion $\mathbf r$ is also a motion about the focus S, for it has
constant angular momentum $\mathbf h_0 = \mathbf r_0 \times \mathbf v_0$. Moreover, $\mathbf r(t_0) = \mathbf r_0$ and $\dot{\mathbf r}(t_0) = \mathbf v_0$.
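The idea of letting swept area serve as the parameter can be tried out numerically. The Python sketch below does this for an ellipse, but by a different route from the arc-length integral above: it uses the classical eccentric anomaly u, for which the area swept from the nearest vertex is (ab/2)(u − e sin u), so that solving Kepler's equation u − e sin u = nt makes area grow linearly in t. This swapped-in technique, the function names, the rate constant n, and the sample values are assumptions for illustration only.

```python
import math

def eccentric_anomaly(M, e, tol=1e-14):
    """Solve Kepler's equation u - e*sin(u) = M by Newton's method (0 <= e < 1)."""
    u = M
    for _ in range(100):
        step = (u - e * math.sin(u) - M) / (1.0 - e * math.cos(u))
        u -= step
        if abs(step) < tol:
            break
    return u

def position(a, e, n, t):
    """Position at time t on an ellipse (semi-major axis a, eccentricity e, focus at origin)."""
    b = a * math.sqrt(1.0 - e * e)
    u = eccentric_anomaly(n * t, e)
    return (a * (math.cos(u) - e), b * math.sin(u))

def swept_area(a, e, n, t0, t1, steps=2000):
    """Area swept by the position vector between t0 and t1, summed over thin triangles."""
    total = 0.0
    x0, y0 = position(a, e, n, t0)
    for k in range(1, steps + 1):
        x1, y1 = position(a, e, n, t0 + (t1 - t0) * k / steps)
        total += 0.5 * (x0 * y1 - x1 * y0)
        x0, y0 = x1, y1
    return total

early = swept_area(2.0, 0.5, 1.0, 0.0, 1.0)
late = swept_area(2.0, 0.5, 1.0, 2.0, 3.0)
print(early, late)   # equal time intervals, so the two areas should agree closely
```

Equal areas over equal times is condition (c) of the Area Theorem, so a motion built this way is a motion about the focus; the general construction in the text achieves the same thing for any conic without appealing to Kepler's equation.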

We haven’t done anything here, by the way, that Newton couldn’t do. You can 
find him geometrically constructing motions about the focus, on given conic loci, in 
the Principia, Book I, Section VI [11, pp. 109-116]. Such constructions are even
implicit in Newton’s proof of the Area Theorem in Propositions I and II, at the 
very beginning of the Principia. In his two-sentence argument for the Shape 
Theorem, Newton fails to mention the problem of putting an orbital motion on his 
constructed conic, but at the Principia’s level of rigor, this is a trivial omission. 
Refer to [15 and 16] for some discussion of this point. 


Step 4: We apply the Acceleration Theorem (Propositions XI-XIII, Section III,
Book I) to $\mathbf{r} = \mathbf{r}(t)$, our newly minted conic motion about the focus, and conclude
that $\mathbf{r}$ has an inverse-square acceleration: for some nonzero $\mu$,

$$\ddot{\mathbf{r}}(t) = -\frac{\mu}{r^2}\,\mathbf{U}(t)$$

for all $t$.


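Step 5: To prove that $\mu = \lambda$, we return to the curvature matching we did in Step 2.
By design, both our constructed motion $\mathbf{r}$ and our given motion $\tilde{\mathbf{r}}$ share the same
radius of curvature at the tip of $\mathbf{r}_0$, namely $\rho_0$. For the conic motion $\mathbf{r}$, by (4),
with $\mathbf{a}_0 = \ddot{\mathbf{r}}(t_0)$,

$$\rho_0 = \frac{v_0^3}{|\mathbf{a}_0 \times \mathbf{v}_0|} = \frac{v_0^3}{\dfrac{\mu}{r_0^2}\,|\mathbf{U}_0 \times \mathbf{v}_0|} = \frac{h_0^2}{\mu\,|\mathbf{U}_0 \times \mathbf{T}_0|^3}.$$

Similarly, for the given motion $\tilde{\mathbf{r}}$,

$$\rho_0 = \frac{h_0^2}{\lambda\,|\mathbf{U}_0 \times \mathbf{T}_0|^3}.$$

It follows that $\mu = \lambda$.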


Step 6: We now have two solutions, the constructed conic motion $\mathbf{r}$ and the given
inverse-square motion $\tilde{\mathbf{r}}$, to the initial-value problem

$$\ddot{\mathbf{r}}(t) = -\frac{\lambda}{r^2}\,\mathbf{U}(t), \qquad \mathbf{r}(t_0) = \mathbf{r}_0, \qquad \dot{\mathbf{r}}(t_0) = \mathbf{v}_0$$

on the interval $J$. By standard uniqueness theorems (equivalent to Propositions
XLI and XLII, Section VIII, Book I, Principia) for differential equations, we
conclude that $\mathbf{r} = \tilde{\mathbf{r}}$ on $J$, and it follows that our given inverse-square motion must
be a conic motion about the focus, as expected.
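
The conclusion of Steps 2 through 6 can be checked numerically. The following is only a sketch under stated assumptions: the motion is planar, the acceleration is taken as $-\mu\,\mathbf{r}/r^3$ with $\mu > 0$, the semi-latus rectum is computed as $l = h_0^2/\mu$, and a classical Runge-Kutta integrator stands in for the uniqueness theorem. The function names and the initial data are illustrative and do not come from the article.

```python
import math

def fitted_conic(r0, v0, mu):
    """Return (l, e) for the conic r.(e + U) = l fitted to the initial data."""
    h = r0[0] * v0[1] - r0[1] * v0[0]      # z-component of h0 = r0 x v0
    l = h * h / mu                          # assumed relation mu = h0^2 / l
    rn = math.hypot(r0[0], r0[1])
    # e = (l/h^2) v0 x h0 - U0; in the plane, v x (h z) = h * (v_y, -v_x)
    e = (l / h * v0[1] - r0[0] / rn, -l / h * v0[0] - r0[1] / rn)
    return l, e

def deriv(state, mu):
    """Right-hand side of the first-order system for r'' = -mu r / |r|^3."""
    x, y, vx, vy = state
    d = math.hypot(x, y) ** 3
    return (vx, vy, -mu * x / d, -mu * y / d)

def rk4_step(state, dt, mu):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = deriv(state, mu)
    k2 = deriv(shifted(k1, dt / 2), mu)
    k3 = deriv(shifted(k2, dt / 2), mu)
    k4 = deriv(shifted(k3, dt), mu)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def max_conic_residual(r0, v0, mu, dt=1e-3, steps=5000):
    """Largest value of |r.e + |r| - l| seen along the integrated motion."""
    l, e = fitted_conic(r0, v0, mu)
    state = (r0[0], r0[1], v0[0], v0[1])
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt, mu)
        x, y = state[0], state[1]
        worst = max(worst, abs(x * e[0] + y * e[1] + math.hypot(x, y) - l))
    return worst
```

For example, `max_conic_residual((1.0, 0.0), (0.3, 1.1), 1.0)` stays at the level of integration error, which is what one would expect if the integrated inverse-square motion never leaves the fitted conic.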

This completes a “Newtonian” proof of the Shape Theorem—that every motion 
having an inverse-square acceleration is a conic motion about the focus—a proof 
springing from Newton’s two-sentence argument in the Principia. Is this proof the 
contemporary version of what Newton had in mind? Probably, but the sheer 
brevity of his sketch leaves room for other views. On this issue, read [15, 16, 20, 
and 23]. 

Of course, our “completed” Newtonian demonstration is really anything but 
complete, since in step four, to ensure that our constructed conic motion had an 
inverse-square acceleration, we called on the unproved reversal of the Shape 
Theorem: 


NEWTON’S ACCELERATION THEOREM. Every conic motion about the focus has 
an inverse-square acceleration. 


We now intend to study the original argument for the Acceleration Theorem and 
then contrast the original with what we might do today, but as we return with this 
intention to the Principia (and specifically to Propositions XI, XII, and XIII in 
Book I), we must first page back to Proposition VI in order to understand how 
Newton measures orbital acceleration. 


4. In May of 1686, just one month after the Principia was presented to the Royal 
Society, Halley sent news to Newton of the plans for printing and publication, but 
his cheerful letter ended with a sour lemon [21, p. 446]: “There is one thing more I 
ought to informe you of,” he wrote, 


that Mr Hook has some pretensions upon the invention of ye rule of the
decrease of Gravity, being reciprocally as the squares of the distances from
the Center. He sais you had the notion from him... how much of this is so,
you know best, as likewise what you have to do in this matter, only Mr Hook
seems to expect you should make some mention of him, in the preface... .


“Now is not this very fine?” sneered back Newton [21, p. 448], 


Mathematicians that find out, settle & do all the business must content 
themselves with being nothing but dry calculators & drudges & another that 
does nothing but pretend & grasp at all things must carry away all the 
invention... And why should I record a man for an Invention who founds his 
claim upon an error therein & on that score gives me trouble? He imagines 
he obliged me by telling me his Theory, but I thought myself disobliged by 
being upon his own mistake corrected magisterially & taught a Theory wch
every body knew & I had a truer notion of then himself.


In his fury at Hooke’s pretensions, Newton struck back with his pen, literally 
striking out almost every reference to Hooke in the entire Principia. 

Even so, Hooke did in fact make one significant contribution to the Principia, 
for he was the first to see orbital motions as the geometric signature of a central 
attraction that pulls the orbiting body away from its linear inertial path. In 
November of 1679, as the new Secretary of the Royal Society, Hooke had asked
Newton to [22, p. 22] “please...continue your former favors to the Society by
communicating what shall occur to you that is Philosophicall,” and he added,


for my own part I shall take it as a great favor... if you will let me know your
thoughts of [my hypothesis] of compounding the celestiall motions of the 
planets of a direct [straight] motion by the tangent & an attractive motion 
towards the centrall body. 


Hooke had this hypothesis as early as 1670, a time when Newton’s eyes were still 
clouded by thoughts of “outward endeavor” and “Cartesian vortices.” Still, Hooke’s 
physical insight could take him only so far. In his hands, the hypothesis remained 
just that: a guess, a guess rooted in physical intuition and mechanical experiment, 
yet still a guess. But in Newton’s hands, the hands of a soaring mathematical 
imagination, Hooke’s hypothesis rose to an aerie of definitions, lemmas, and 
propositions. Look, for example, at the figure Newton draws to illustrate his proof 
of Propositions I and II (Section II, Book I), where we see, for the very first time, 


the mathematical equivalence of central attraction and the area law, and you 
behold, in its central attraction and deviations from the tangent, the risen form of 
Hooke’s hypothesis. 

Later, in Proposition VI, Newton fashions from Hooke’s inward deviation a
formula for measuring the acceleration of an orbital motion. (In the Principia,
accelerations for general motions are never even defined.) If a particle in orbital
motion falls freely toward the acceleration center S, Newton may have reasoned
that the particle could be thought of as instantaneously in free fall from the
tangent down to its position on the orbit. In a given time $t$, suppose a particle
moves along its orbit from P to Q. If there had been no acceleration during this
time interval, the particle would have proceeded instead along the tangent at
constant speed $v$ to a location L. The deviation QL, nearly parallel to SP, would be
like the “distance fallen toward S,” which we would expect to be approximately
$\tfrac{1}{2}at^2$, where $a$ gives the acceleration at P. This suggests

$$\frac{2\,QL}{t^2} \to a$$

as $t \to 0$. Sanding top and bottom, Newton could now have shaped the measure
$QL/t^2$ to fit squarely into his geometric approach. First nudge L just a bit along
the tangent to the position R, making the deviation QR exactly parallel to SP.


Because time varies as the area in orbital motions, replace $t$ by the area of the
“sector” PSQ, and the sector in turn by the approximating triangle PSQ, in the
process turning $t$ into the product $SP \cdot QT$ (no need to keep tabs on constant
factors, such as the missing 1/2 here, for Newton works with proportions, not
equations) and the measure $QL/t^2$ into $QR/(SP \cdot QT)^2$. The limit of this ratio,
as $Q \to P$, gauges the acceleration at P. In the Principia, this measure of
acceleration appears as Corollary I to Proposition VI (Section II, Book I) [11, p. 48]. With
this corollary, Newton later derives acceleration laws from orbit shapes.


Cor. I. If a body P revolving about the center S describes a curved line APQ,
which a right line ZPR touches in any point P; and from any other point Q of
the curve, QR is drawn parallel to the distance SP, meeting the tangent in R; and
QT is drawn perpendicular to the distance SP; the centripetal force will be
inversely as the solid $SP^2 \cdot QT^2/QR$, if the solid be taken of that magnitude
which it ultimately acquires when the points P and Q coincide.

Before we leave the topic of acceleration, we should take a moment to discuss
the role of force and mass in the early sections of the Principia. The word ‘force’ 
appears, as it does above in Corollary I, in many of the definitions, axioms, 
corollaries, and propositions of the Principia, but in the first ten sections, where 
Newton attends to the one-body problem, force, and mass as well, exist literally in 
name only, playing no part in the mathematics. He may talk of ‘force,’ but Newton 
calculates accelerations. The Cartesians, Huygens and Leibniz among them, 
claimed that Newton, by introducing gravity, and therefore action at a distance, 
brought Aristotelian ‘occult qualities’ back into physics. But he should plead 
innocent to this charge. In the Principia’s work on orbital motions, ‘force’ and 
‘gravity’ become merely convenient words, as Newton stresses the relations and 
laws, with no comment on causes. The cause of gravity comes up only in a General 
Scholium on the final pages of the Principia [11, p. 547]: “But hitherto I have not 
been able to discover the cause of those properties of gravity from phenomena,” 
wrote Newton, 


and I frame no hypotheses; for whatever is not deduced is to be called an
hypothesis; and hypotheses, whether metaphysical or physical, whether of 
occult qualities or mechanical, have no place in experimental philosophy. 
... And to us it is enough that gravity does really exist, and act according to 
the laws which we have explained, and abundantly serves to account for all 
the motions of the celestial bodies, and of our sea. 


Wouldn’t Newton, that lover of geometry and curvature, have been delighted with 
Einstein’s view that geometry, indeed the curvature of spacetime, is the very cause 
of gravity? 

After this interlude on Newton’s measure of acceleration, we remain in the past, 
looking for the original proof of the Acceleration Theorem in the Principia. 


5. Wasting no time after Corollary I to Proposition VI, Newton attacks a series of 
problems with his new measure of acceleration. In Propositions VII through XIII, 
he calculates the acceleration law for circular motions about any given point, 
semicircular motions about a point infinitely remote, spiral motions about the pole, 
elliptical motions about the center, and then, in a stately section all their own, 
elliptical, hyperbolic, and parabolic motions about the focus. Taken together, this 
final triumphant trio of propositions (XI, XII, and XIII) establishes the Accelera- 
tion Theorem: Every conic motion about the focus has an inverse-square acceleration. 

Newton could have proved the Acceleration Theorem in a single proposition 
covering general conic motions, but “... because of the dignity of the Problem...,” 
he writes, “I shall confirm the...cases by particular demonstrations.” [11, p. 57] 
These “particular demonstrations” naturally offer the same argument with minor 
variations, so we may safely choose one of the propositions to represent all three.
Turn then to the most celebrated page of the Principia and to Newton’s analysis for 
Proposition XI: 


PROPOSITION XI PROBLEM VI 


If a body revolves in an ellipse; it is required to find the law of the centripetal force
tending to the focus of the ellipse.

In the ellipse, Newton draws conjugate diameters DK and PG, with DK parallel
to the tangent RPZ. (The midpoints of parallel chords in an ellipse lie on a line,
called a diameter of the ellipse, and the parallel chords are then called the
ordinates of the diameter. Two diameters with the property that each bisects every
chord parallel to the other are said to be conjugate diameters.) From Q he drops
three lines: QR parallel to the focal radius SP, QT perpendicular to SP, and Qx
completing the parallelogram QxPR. He then extends Qx until it meets PG at v
and draws PF perpendicular to DK.

Newton’s analysis requires the services of three lemmas, one of his own and two 
well known to Apollonius of Perga. (For the two Apollonian lemmas, see [1, I p. 15 
and VII p. 31] or [18, p. 151 and p. 169].) 


NEWTON’S LEMMA. PE = AC 


LEMMA 1. All parallelograms circumscribed about any conjugate diameters of an 
ellipse have equal area. 


LEMMA 2. In an ellipse, the squares of the ordinates of any conjugate diameter are 
proportional to the rectangles under the segments which they make on the diameter. 


As we have seen in the previous section, Newton measures the acceleration of
an orbital motion by computing the limit of the ratio

$$\frac{QR}{(SP \cdot QT)^2}$$

as $Q \to P$. To infer an inverse-square acceleration for this case of elliptical motion
about the focus, he must therefore prove that $QR/QT^2$ has a limit independent of
P. In fact, as we now show, Newton’s argument reveals that $QT^2/QR$ tends to the
latus rectum of the ellipse.

Because QR is Px and (by Newton’s Lemma) PE is AC, the similarity of the
triangles Pxv and PEC implies

$$QR = \frac{Pv \cdot AC}{PC}.$$

On the other hand, Newton’s Lemma (again) and the similarity of the triangles
QxT and PEF give

$$QT = \frac{Qx \cdot PF}{AC} = \frac{Qx \cdot BC}{CD},$$

where the second equality follows from Lemma 1, which assures us that $PF \cdot CD = BC \cdot AC$. We infer

$$\frac{QT^2}{QR} = \frac{Qx^2 \cdot BC^2}{CD^2} \cdot \frac{PC}{Pv \cdot AC} = \frac{1}{2}L\,\frac{Qx^2 \cdot PC}{Pv \cdot CD^2},$$

where we have replaced $2BC^2/AC$ by $L$. (Following Apollonius, Newton calls
$2BC^2/AC$ the latus rectum.) If now $Q \to P$, this last expression has the same limit
as

$$\frac{1}{2}L\left(\frac{vG}{PC}\right),$$

for $Qv/Qx$ tends to one and Lemma 2 implies

$$\frac{Qv^2}{Pv \cdot vG} = \frac{CD^2}{PC^2}.$$

But $vG \to 2PC$, so that $\tfrac{1}{2}L(vG/PC)$, and thus also $QT^2/QR$, must tend to $L$. This
completes Newton’s analysis for Proposition XI: Every elliptical motion about the
focus has an inverse-square acceleration.
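
The limit $QT^2/QR \to L$ can be sanity-checked with coordinates. This is a rough sketch, not Newton's construction: it assumes an ellipse with semi-axes `a` and `b` centered at the origin, a focus at $(\sqrt{a^2 - b^2}, 0)$, and points P and Q given by a parameter angle; all of these names are illustrative.

```python
import math

def newton_ratio(a, b, E0, delta):
    """QT^2 / QR for P, Q on the ellipse at parameter values E0 and E0 + delta."""
    def cross(p, q):
        return p[0] * q[1] - p[1] * q[0]

    S = (math.sqrt(a * a - b * b), 0.0)                 # a focus of the ellipse
    P = (a * math.cos(E0), b * math.sin(E0))
    Q = (a * math.cos(E0 + delta), b * math.sin(E0 + delta))
    T = (-a * math.sin(E0), b * math.cos(E0))           # tangent direction at P
    SP = (P[0] - S[0], P[1] - S[1])
    r = math.hypot(SP[0], SP[1])
    u = (SP[0] / r, SP[1] / r)                          # unit vector along SP

    # R lies on the tangent at P with QR parallel to SP: Q + s*u = P + tau*T.
    # Taking the cross product of both sides with T eliminates tau.
    PQ = (P[0] - Q[0], P[1] - Q[1])
    QR = abs(cross(PQ, T) / cross(u, T))
    # QT is the perpendicular distance from Q to the line SP.
    QT = abs(cross((Q[0] - S[0], Q[1] - S[1]), u))
    return QT * QT / QR
```

With `a = 2` and `b = 1` the latus rectum is $2b^2/a = 1$, and `newton_ratio(2.0, 1.0, E0, 1e-4)` should come out close to 1 for any starting point `E0`, which is the independence of P that the argument needs.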


6. We have been “going under with the swirls and coming out with the eddies, 
following along the way the water goes,” but now just one quick swirl remains: to 
return from the Principia to the present, from Newton’s original work on the 
Acceleration Theorem to the delightful contrast of a contemporary argument. 

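Any conic motion $\mathbf{r} = \mathbf{r}(t)$ about the focus must satisfy the vector-conic
equation (2),

$$\mathbf{r} \cdot (\mathbf{e} + \mathbf{U}) = l,$$

for some positive constant $l$ and constant vector $\mathbf{e}$. Since $\mathbf{r}$ is an orbital motion,
$\mathbf{h} = \mathbf{r} \times \mathbf{v}$ is a constant vector. Since $\mathbf{r}$ is a conic motion,

$$\mathbf{L} = \frac{l}{h^2}\,\mathbf{v} \times \mathbf{h} - \mathbf{U}$$

is a second constant vector (equal to the eccentricity vector $\mathbf{e}$ by (3)).
Differentiating $\mathbf{L}$ yields

$$\mathbf{0} = \frac{l}{h^2}\,\mathbf{a} \times \mathbf{h} - \frac{\mathbf{h} \times \mathbf{r}}{r^3},$$

and taking lengths we uncover an inverse-square acceleration,

$$a = \frac{h^2}{l}\,\frac{1}{r^2},$$
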

proving again 


NEWTON’S ACCELERATION THEOREM. Every conic motion about the focus has
an inverse-square acceleration.


Applications of Linear Algebra in Calculus


Jack W. Rogers, Jr. 


1. INTRODUCTION. The concepts of basis, matrix for a linear transformation 
relative to bases, and change-of-basis matrix are fundamental in linear algebra, but 
students in an introductory class often have trouble understanding the point of 
applying these concepts for bases other than the standard basis for $\mathbb{R}^n$. Our object
is to illustrate some applications of these concepts in solving problems with which
students who have recently completed the calculus sequence should be familiar.
The spaces are abstract vector spaces (finite-dimensional subspaces of function spaces), not
simply subspaces of $\mathbb{R}^n$; they have no obvious natural basis. We see that standard
linear algebra techniques, such as matrix inversion, can be applied in place of the 
usual calculus techniques of substitution or integration by parts. The students may 
judge for themselves the relative difficulty of calculus methods vs. linear algebra 
methods—and the understanding that each provides—for these types of problems. 
This is not to deny the fundamental importance of substitution or integration by 
parts in calculus. Students are assumed to have mastered these techniques in their 
calculus courses and to be familiar with the problems to which they are applied. 
These problems can then be used to motivate new ideas in linear algebra. 


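2. PRELIMINARIES. We adopt the following conventions. All vector spaces are
over $\mathbb{R}$. Any sequence $\mathcal{B} = (b_1, \ldots, b_k)$ of vectors in a vector space $V$ determines
a linear transformation $L_{\mathcal{B}}\colon \mathbb{R}^k \to V$ defined by

$$L_{\mathcal{B}}(x) = L_{\mathcal{B}}\begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} = x_1 b_1 + \cdots + x_k b_k.$$

If $\mathcal{B} = (b_1, \ldots, b_k)$ is a basis for $V$, then $L_{\mathcal{B}}$ is both injective and surjective. Thus,
for every vector $x \in V$ there is a unique vector $c \in \mathbb{R}^k$ such that $x = L_{\mathcal{B}}c$. The
coordinates $c_1, \ldots, c_k$ of $c$ are called the coordinates of $x$ relative to the basis $\mathcal{B}$,
and we write

$$[x]_{\mathcal{B}} = c = \begin{bmatrix} c_1 \\ \vdots \\ c_k \end{bmatrix}.$$

This defines the linear coordinate transformation, $[\cdot]_{\mathcal{B}}\colon V \to \mathbb{R}^k$, which is the
inverse of $L_{\mathcal{B}}$.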


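3. MATRICES FOR DIFFERENTIATION. Suppose that $U$ and $V$ are linear
spaces with bases $\mathcal{B} = (b_1, \ldots, b_n)$ and $\mathcal{C} = (c_1, \ldots, c_m)$, respectively, and
$T\colon U \to V$ is linear. Suppose $x = x_1 b_1 + \cdots + x_n b_n \in U$. For $i = 1, \ldots, n$, $T(b_i)$ has a
coordinate vector relative to $\mathcal{C}$, denoted by $[T(b_i)]_{\mathcal{C}}$. These vectors form the
columns of the matrix for $T$ relative to $\mathcal{B}$ and $\mathcal{C}$, which we denote by $M_{T,\,\mathcal{C} \leftarrow \mathcal{B}}$, and
which transforms the $\mathcal{B}$-coordinates of any vector in $U$ into the $\mathcal{C}$-coordinates of
its image under $T$, as we see by using the linearity of $T$ and of the coordinate
transformation $[\cdot]_{\mathcal{C}}$ to obtain

$$\begin{aligned}
[T(x)]_{\mathcal{C}} &= [x_1 T(b_1) + \cdots + x_n T(b_n)]_{\mathcal{C}} = x_1 [T(b_1)]_{\mathcal{C}} + \cdots + x_n [T(b_n)]_{\mathcal{C}} \\
&= \begin{bmatrix} [T(b_1)]_{\mathcal{C}} & \cdots & [T(b_n)]_{\mathcal{C}} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = M_{T,\,\mathcal{C} \leftarrow \mathcal{B}}\,[x]_{\mathcal{B}}.
\end{aligned}$$

The direction of the arrow, opposite to that in $T\colon U \to V$, is chosen to preserve the
order of the subscripts in the equation $[T(x)]_{\mathcal{C}} = M_{T,\,\mathcal{C} \leftarrow \mathcal{B}}\,[x]_{\mathcal{B}}$.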


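3.1. An application to antidifferentiation. We consider differentiation as a linear
transformation $\mathcal{D}\colon C^1(\mathbb{R}) \to C(\mathbb{R})$ from the space of continuously differentiable
functions $f\colon \mathbb{R} \to \mathbb{R}$ to the continuous functions. If $\mathcal{B} = (b_1, \ldots, b_n)$ is a finite
sequence of vectors in $C^1(\mathbb{R})$ such that $\mathcal{D}$ leaves $S = \operatorname{span}\mathcal{B}$ invariant, i.e.,
$\mathcal{D}(S) \subset S$, then the derivative restricted to $S$ can be represented by an $n \times n$
matrix relative to the basis $\mathcal{B}$ alone. As a simple example, suppose $\mathcal{B} = \mathcal{C} = (\sin t, \cos t)$
and $U = V = \operatorname{span}\mathcal{B}$. Then

$$[\mathcal{D}(b_1)]_{\mathcal{B}} = [\mathcal{D}(\sin t)]_{\mathcal{B}} = [0 \cdot \sin t + 1 \cdot \cos t]_{\mathcal{B}} = \begin{bmatrix} 0 \\ 1 \end{bmatrix},$$

$$[\mathcal{D}(b_2)]_{\mathcal{B}} = [\mathcal{D}(\cos t)]_{\mathcal{B}} = [-1 \cdot \sin t + 0 \cdot \cos t]_{\mathcal{B}} = \begin{bmatrix} -1 \\ 0 \end{bmatrix},$$

and the matrix $D = M_{\mathcal{D},\,\mathcal{B} \leftarrow \mathcal{B}}$ for $\mathcal{D}$ relative to $\mathcal{B}$ is $D = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$. This is the
matrix for a rotation of the plane 90° counterclockwise, which gives a geometric
picture for the cycle of derivatives $\sin t \mapsto \cos t \mapsto -\sin t \mapsto -\cos t \mapsto \sin t$.

The matrix $D$ is invertible, and $D^{-1} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$. Since the inverse of
differentiation is integration, the columns of $D^{-1}$ must represent antiderivatives of the basis
elements:

$$\int \sin t\,dt = L_{\mathcal{B}}\left(\begin{bmatrix} 0 \\ -1 \end{bmatrix}\right) = -\cos t \qquad\text{and}\qquad \int \cos t\,dt = L_{\mathcal{B}}\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \sin t.$$

Of course, there are other antiderivatives for the sine and cosine, differing from
these antiderivatives by constants. They do not appear because the space of
constant functions, which forms the kernel of the derivative transformation,
intersects $S$ only at the origin. Thus, relative to $\mathcal{B}$, antidifferentiation is unique.

Now consider $\int t^2 e^t\,dt$. This is a typical integration by parts problem, requiring
two applications to complete the integration. Instead, we look for a space
containing $t^2 e^t$ that is invariant under differentiation. By successive differentiation of $t^2 e^t$,
we find one such space, the span of $\mathcal{B} = (t^2 e^t, te^t, e^t)$, and we have

$$[\mathcal{D}(t^2 e^t)]_{\mathcal{B}} = [t^2 e^t + 2te^t]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix},$$

$$[\mathcal{D}(te^t)]_{\mathcal{B}} = [te^t + e^t]_{\mathcal{B}} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix},$$

$$[\mathcal{D}(e^t)]_{\mathcal{B}} = [e^t]_{\mathcal{B}} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.$$

For this problem, the matrix $D = M_{\mathcal{D},\,\mathcal{B} \leftarrow \mathcal{B}}$ for $\mathcal{D}$ relative to $\mathcal{B}$ is

$$D = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \qquad\text{and}\qquad D^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix}.$$

The single matrix inversion provides all the following antiderivatives, with
coefficients given by the columns of $D^{-1}$:

$$\int t^2 e^t\,dt = t^2 e^t - 2te^t + 2e^t, \qquad \int te^t\,dt = te^t - e^t, \qquad\text{and}\qquad \int e^t\,dt = e^t.$$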


Matrix inversion has replaced the use of integration by parts for this problem. 
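
The computation above is easy to reproduce by machine. The sketch below is one possible way to do it, assuming exact rational arithmetic and a hand-written Gauss-Jordan routine (the helper name `invert` is illustrative, not from the article); the matrix `D` is entered column by column from the coordinate vectors found above.

```python
from fractions import Fraction

def invert(matrix):
    """Gauss-Jordan inverse over the rationals (assumes the matrix is invertible)."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Columns: coordinates of D(t^2 e^t), D(t e^t), D(e^t) in the basis (t^2 e^t, t e^t, e^t).
D = [[1, 0, 0],
     [2, 1, 0],
     [0, 1, 1]]
D_inv = invert(D)

# Column j of D_inv gives an antiderivative of the j-th basis element.
antiderivative_of_t2_exp = [row[0] for row in D_inv]
```

Here `antiderivative_of_t2_exp` holds the coefficients of $t^2 e^t$, $te^t$, and $e^t$ in an antiderivative of $t^2 e^t$.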

The same technique can be used to provide an alternative approach for many
standard integration by parts problems. For $\int t \sin t\,dt$, for example, let
$\mathcal{B} = (t \sin t, t \cos t, \sin t, \cos t)$; for $\int e^t \sin t\,dt$, a trickier integration by parts problem,
let $\mathcal{B} = (e^t \sin t, e^t \cos t)$.


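3.2. An application to linear differential equations. Suppose $f \in C^n$ and consider
the nonhomogeneous differential equation

$$L(y) = a_n y^{(n)} + \cdots + a_0 y = f. \tag{3.1}$$

This is represented in matrix form as

$$L[y]_{\mathcal{B}} = (a_n D^n + \cdots + a_0 I)[y]_{\mathcal{B}} = [f]_{\mathcal{B}}.$$

Assuming $[f]_{\mathcal{B}}$ is in the image of $L(D)$, this yields the coordinates of a particular
solution for (3.1). For example, consider

$$y'' + y' + y = \sin t.$$

Using the basis $\mathcal{B} = (\sin t, \cos t)$, we have $[\sin t]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and

$$L = D^2 + D + I = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} + \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$

The solution of $L[y]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ is $[y]_{\mathcal{B}} = \begin{bmatrix} 0 \\ -1 \end{bmatrix} = [-\cos t]_{\mathcal{B}}$, yielding $y = -\cos t$ as a
particular solution for the differential equation.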

This is related to the method of undetermined coefficients for finding a 
particular solution [2]; using the matrix for the derivative simplifies the computa- 
tions. 
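
As an illustration of the matrix form of (3.1), here is a small sketch that assembles $a_0 I + a_1 D + \cdots + a_n D^n$ for the basis $(\sin t, \cos t)$ and solves the resulting $2 \times 2$ system. The helper names and the use of Cramer's rule are choices made for this example only.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def operator_matrix(coeffs, D):
    """Matrix of a_0 I + a_1 D + ... + a_n D^n for coeffs = [a_0, ..., a_n]."""
    n = len(D)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # D^0 = I
    total = [[Fraction(0)] * n for _ in range(n)]
    for a in coeffs:
        total = [[t + a * p for t, p in zip(trow, prow)]
                 for trow, prow in zip(total, power)]
        power = matmul(power, D)
    return total

def solve_2x2(M, rhs):
    """Cramer's rule for a 2 x 2 system; assumes det(M) != 0."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - rhs[0] * M[1][0]) / det]

D = [[0, -1], [1, 0]]                  # differentiation on span(sin t, cos t)
L = operator_matrix([1, 1, 1], D)      # y'' + y' + y
y_coords = solve_2x2(L, [1, 0])        # right-hand side sin t has coordinates (1, 0)
```

The result `y_coords` gives the coefficients of $\sin t$ and $\cos t$ in the particular solution, matching $y = -\cos t$ above.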


4. CHANGE-OF-BASIS AND CERTAIN TRIGONOMETRIC INTEGRALS. We 
turn now to an application for the change-of-basis matrix. 

Suppose $\mathcal{B}$ and $\mathcal{C}$ are two bases for the same $k$-dimensional vector space, $V$.
The change from $\mathcal{B}$-coordinates to $\mathcal{C}$-coordinates is a linear transformation, with
an associated matrix $P_{\mathcal{C} \leftarrow \mathcal{B}}$ satisfying

$$[x]_{\mathcal{C}} = P_{\mathcal{C} \leftarrow \mathcal{B}}\,[x]_{\mathcal{B}}.$$

It is the matrix for the identity transformation relative to the bases $\mathcal{B}$ and $\mathcal{C}$:

$$P_{\mathcal{C} \leftarrow \mathcal{B}} = M_{I,\,\mathcal{C} \leftarrow \mathcal{B}} = \begin{bmatrix} [b_1]_{\mathcal{C}} & \cdots & [b_k]_{\mathcal{C}} \end{bmatrix}.$$


4.1. Integrating powers of cos x. The integration of $\cos^n t$ is usually accomplished
at the calculus level by substitution if $n$ is odd, and by application of the
double-angle formula $\cos 2t = 2\cos^2 t - 1$, repeated as necessary, if $n$ is even. An
alternative, considering this as a change-of-basis problem in linear algebra,
provides a unified approach to integrating any polynomial in $\cos t$.

Let $\mathcal{B}_n = (1, \cos t, \ldots, \cos nt)$. There are several ways to show that these
functions are independent; here, we simply make that assumption. Thus $\mathcal{B}_n$ is a
basis for the span, $S_n$, of its terms.

Adding $\cos(k + 1)t$ and $\cos(k - 1)t$ and regrouping yields the recursion

$$\cos(k + 1)t = 2\cos kt \cos t - \cos(k - 1)t.$$

For the first few values of $k$, we have

$$\begin{aligned}
\cos 0t &= 1 \\
\cos 1t &= \cos t \\
\cos 2t &= 2(\cos 1t)\cos t - \cos 0t = -1 + 2\cos^2 t \\
\cos 3t &= 2(\cos 2t)\cos t - \cos t = 2(-1 + 2\cos^2 t)\cos t - \cos t \\
        &= -3\cos t + 4\cos^3 t \\
\cos 4t &= 1 - 8\cos^2 t + 8\cos^4 t \\
\cos 5t &= 5\cos t - 20\cos^3 t + 16\cos^5 t
\end{aligned} \tag{4.2}$$

In general, this recursion expresses elements of $\mathcal{B}_n$ as Chebyshev polynomials
(see Section 5.1) in $\cos t$, i.e., as linear combinations of the elements of
$\mathcal{C}_n = (1, \cos t, \ldots, \cos^n t)$. Since $\mathcal{C}_n$ is a sequence of $n + 1$ vectors spanning the
$(n + 1)$-dimensional space $S_n$, $\mathcal{C}_n$ is also a basis for $S_n$. The equations (4.2)
show how to form the change-of-basis matrix $P_{\mathcal{C} \leftarrow \mathcal{B}}$ (we suppress subscripts and
superscripts for the bases unless needed for clarity). The inverse change-of-basis
matrix, $P_{\mathcal{B} \leftarrow \mathcal{C}}$, converts a polynomial in $\cos t$, which is difficult to integrate, into a
linear combination of terms of the form $\cos kt$, which is easy to integrate.
For example, for $n = 2$, equations (4.2) show that

$$P_{\mathcal{C} \leftarrow \mathcal{B}} = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \qquad\text{and}\qquad P_{\mathcal{B} \leftarrow \mathcal{C}} = P_{\mathcal{C} \leftarrow \mathcal{B}}^{-1} = \begin{bmatrix} 1 & 0 & \tfrac{1}{2} \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{2} \end{bmatrix}.$$

Thus, for an arbitrary second-degree polynomial $p$ in $\cos t$, we have

$$\begin{aligned}
[p]_{\mathcal{C}} &= [c_0 + c_1 \cos t + c_2 \cos^2 t]_{\mathcal{C}} = \begin{bmatrix} c_0 & c_1 & c_2 \end{bmatrix}^T \\
\Rightarrow\ [p]_{\mathcal{B}} &= P_{\mathcal{B} \leftarrow \mathcal{C}}\,[p]_{\mathcal{C}} = \begin{bmatrix} c_0 + \tfrac{1}{2}c_2 & c_1 & \tfrac{1}{2}c_2 \end{bmatrix}^T \\
\Rightarrow\ p &= (c_0 + \tfrac{1}{2}c_2) + c_1 \cos t + \tfrac{1}{2}c_2 \cos 2t,
\end{aligned}$$

which is easily integrated. 
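
The change of basis in Section 4.1 can be carried out for any $n$. The sketch below builds the matrix from the recursion $\cos(k+1)t = 2\cos kt\cos t - \cos(k-1)t$ and inverts it exactly; the function names and the Gauss-Jordan helper are assumptions of this example rather than anything prescribed by the text.

```python
from fractions import Fraction

def cos_multiple_angle_columns(n):
    """cols[k][i] = coefficient of cos^i t in cos(kt), for 0 <= k <= n."""
    cols = [[1] + [0] * n]
    if n >= 1:
        cols.append([0, 1] + [0] * (n - 1))
    for k in range(1, n):
        times_cos = [0] + cols[k][:-1]          # multiply cos(kt) by cos t
        cols.append([2 * c - p for c, p in zip(times_cos, cols[k - 1])])
    return cols

def invert(matrix):
    """Gauss-Jordan inverse over the rationals (assumes the matrix is invertible)."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def change_of_basis(n):
    """Return the matrix from multiple-angle to power coordinates, and its inverse."""
    cols = cos_multiple_angle_columns(n)
    P = [[cols[k][i] for k in range(n + 1)] for i in range(n + 1)]
    return P, invert(P)

def to_multiple_angles(power_coeffs):
    """Rewrite sum_i c_i cos^i t as sum_k d_k cos(kt); returns [d_0, ..., d_n]."""
    n = len(power_coeffs) - 1
    _, P_inv = change_of_basis(n)
    return [sum(P_inv[i][j] * power_coeffs[j] for j in range(n + 1))
            for i in range(n + 1)]
```

For instance, `to_multiple_angles([0, 0, 1])` returns the coefficients of $1$, $\cos t$, and $\cos 2t$ in $\cos^2 t$, after which integration is term by term.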


4.2. The case for even n. We have seen that polynomials in $\cos t$ can be integrated
using an appropriate change of basis. Integrating $\cos^n t$ for odd $n$ can be done by
substitution, so we can focus our attention on the even powers. This problem can
also be solved using a change-of-basis matrix, one that is half the size of the one
we must deal with if we include the odd powers.

Adding the expressions for $\cos(k + 2)t$ and $\cos(k - 2)t$ yields, for $k > 1$,

$$\begin{aligned}
\cos(k + 2)t &= 2(\cos kt)(\cos 2t) - \cos(k - 2)t \\
             &= 2(\cos kt)(-1 + 2\cos^2 t) - \cos(k - 2)t.
\end{aligned}$$

Thus,

$$\begin{aligned}
\cos 0t &= 1 \\
\cos 2t &= -1 + 2\cos^2 t \\
\cos 4t &= 2(\cos 2t)(-1 + 2\cos^2 t) - \cos 0t = 2(-1 + 2\cos^2 t)^2 - 1 \\
        &= 1 - 8\cos^2 t + 8\cos^4 t \\
\cos 6t &= 2(1 - 8\cos^2 t + 8\cos^4 t)(-1 + 2\cos^2 t) - (-1 + 2\cos^2 t) \\
        &= -1 + 18\cos^2 t - 48\cos^4 t + 32\cos^6 t
\end{aligned} \tag{4.3}$$

Continued application of the recursion formula yields, for every $k$, an expression
for $\cos 2kt$ as a polynomial in even powers of $\cos t$. Consequently, the sequences
$\mathcal{C}^{(2)}_n = (1, \cos^2 t, \ldots, \cos^{2n} t)$ and $\mathcal{B}^{(2)}_n = (1, \cos 2t, \ldots, \cos 2nt)$ are bases for the
same $(n + 1)$-dimensional subspace of $S_{2n}$, and the change-of-basis matrix
provides an organized method for integrating even polynomials in $\cos t$, including,
as a special case, $\cos^{2n} t$.


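For example, for $\int \cos^6 t\,dt$, $n = 3$, and we have

$$P_{\mathcal{C} \leftarrow \mathcal{B}} = \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 2 & -8 & 18 \\ 0 & 0 & 8 & -48 \\ 0 & 0 & 0 & 32 \end{bmatrix}$$

and

$$P_{\mathcal{B} \leftarrow \mathcal{C}} = P_{\mathcal{C} \leftarrow \mathcal{B}}^{-1} = \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{3}{8} & \tfrac{5}{16} \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} & \tfrac{15}{32} \\ 0 & 0 & \tfrac{1}{8} & \tfrac{3}{16} \\ 0 & 0 & 0 & \tfrac{1}{32} \end{bmatrix}.$$

The coefficients of $\cos^6 t$ are in the last column of this matrix,
$\cos^6 t = \tfrac{5}{16} + \tfrac{15}{32}\cos 2t + \tfrac{3}{16}\cos 4t + \tfrac{1}{32}\cos 6t$, from which the integral is easily obtained. The
formula used here is a typical three-term recursion, i.e., except for $\cos 2t$, which
does not depend on $k$, the value of $\cos(k + 2)t$ depends only on two earlier even
values, $\cos kt$ and $\cos(k - 2)t$.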
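
To see the last step in action, the expansion of $\cos^6 t$ can be integrated term by term and compared with a direct numerical integration. This is a minimal sketch; the composite Simpson rule and the names `COEFFS`, `cos6_antiderivative`, and `simpson` are choices made here for the check.

```python
import math
from fractions import Fraction

# Last column of the inverse matrix: cos^6 t = sum_k COEFFS[k] * cos(2k t).
COEFFS = [Fraction(5, 16), Fraction(15, 32), Fraction(3, 16), Fraction(1, 32)]

def cos6_antiderivative(t):
    """Antiderivative of cos^6 t vanishing at t = 0, integrated term by term."""
    total = float(COEFFS[0]) * t
    for k in range(1, len(COEFFS)):
        total += float(COEFFS[k]) * math.sin(2 * k * t) / (2 * k)
    return total

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

numeric = simpson(lambda t: math.cos(t) ** 6, 0.0, 1.0)
exact = cos6_antiderivative(1.0)
```

The two values `numeric` and `exact` should agree to many digits, and over $[0, \pi/2]$ the sine terms drop out, leaving $\tfrac{5}{16} \cdot \tfrac{\pi}{2}$.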

For comparison, the standard calculus approach uses the double-angle formula
$\cos^{2k} t = (\cos^2 t)^k = [\tfrac{1}{2}(1 + \cos 2t)]^k$, integrating constants and odd powers by
substitution as they occur, until $k$ is reduced to 1 or 0. More formally, suppose all
even powers $\cos^{2i} t$, for $i < k$, have been expressed in terms of odd powers of
$\cos jt$. Then the binomial formula yields

$$\cos^{2k} t = \frac{1}{2^k}(1 + \cos 2t)^k = \frac{1}{2^k}\sum_{j=0}^{k}\binom{k}{j}\cos^j 2t,$$

and the even powers in this expression can be replaced by the expressions already
calculated. For $n = 6$, this algorithm yields

$$\cos^6 t = \tfrac{5}{16} + \tfrac{3}{8}\cos 2t + \tfrac{3}{16}\cos 4t + \tfrac{1}{8}\cos^3 2t.$$

This requires all the earlier even powers of $\cos jt$, not just the previous two, and
generates a more complicated basis; it does, however, avoid inversion.


4.3. Integrating powers of sin x. For powers of the sine function, we let
$\mathcal{C}_n = (1, \sin t, \ldots, \sin^n t)$, and take $\mathcal{D}_n = (1, \sin t, \cos 2t, \ldots, \cos nt)$ if $n$ is even, and
$\mathcal{D}_n = (1, \sin t, \cos 2t, \ldots, \sin nt)$ if $n$ is odd. We need two recursion formulas.
Subtracting $\cos(k - 1)t$ from $\cos(k + 1)t$ yields

$$\cos(k + 1)t = -2\sin kt \sin t + \cos(k - 1)t.$$

Subtracting $\sin(k - 1)t$ from $\sin(k + 1)t$ yields

$$\sin(k + 1)t = 2\cos kt \sin t + \sin(k - 1)t.$$

Using the first equation for odd $k$ and the second for even $k$, we obtain, by
induction, expressions for the elements of $\mathcal{D}_n$ in terms of those in $\mathcal{C}_n$, and
hence the components of $P_{\mathcal{C} \leftarrow \mathcal{D}}$. The first few are

$$\begin{aligned}
\cos 0t &= 1 \\
\sin 1t &= \sin t \\
\cos 2t &= \cos 0t - 2\sin t \sin t = 1 - 2\sin^2 t \\
\sin 3t &= 2\cos 2t \sin t + \sin t = 2(1 - 2\sin^2 t)\sin t + \sin t = 3\sin t - 4\sin^3 t \\
\cos 4t &= 1 - 8\sin^2 t + 8\sin^4 t \\
\sin 5t &= 5\sin t - 20\sin^3 t + 16\sin^5 t
\end{aligned} \tag{4.4}$$


There is an obvious similarity between these formulas and those for the powers of 
the cosine, which can be explained using the basic definition of the Chebyshev 
polynomials (see Section 5.1). 

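For $n = 2$, these equations show that

$$P_{\mathcal{C} \leftarrow \mathcal{D}} = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{bmatrix} \qquad\text{and}\qquad P_{\mathcal{D} \leftarrow \mathcal{C}} = P_{\mathcal{C} \leftarrow \mathcal{D}}^{-1} = \begin{bmatrix} 1 & 0 & \tfrac{1}{2} \\ 0 & 1 & 0 \\ 0 & 0 & -\tfrac{1}{2} \end{bmatrix}.$$

Thus, for an arbitrary second-degree polynomial $p$ in $\sin t$, we have

$$\begin{aligned}
[p]_{\mathcal{C}} &= [c_0 + c_1 \sin t + c_2 \sin^2 t]_{\mathcal{C}} = \begin{bmatrix} c_0 & c_1 & c_2 \end{bmatrix}^T \\
\Rightarrow\ [p]_{\mathcal{D}} &= P_{\mathcal{D} \leftarrow \mathcal{C}}\,[p]_{\mathcal{C}} = \begin{bmatrix} c_0 + \tfrac{1}{2}c_2 & c_1 & -\tfrac{1}{2}c_2 \end{bmatrix}^T \\
\Rightarrow\ p &= (c_0 + \tfrac{1}{2}c_2) + c_1 \sin t - \tfrac{1}{2}c_2 \cos 2t,
\end{aligned}$$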


which is easily integrated. 

As these formulas indicate, it follows by induction that $(1, \sin^2 t, \ldots, \sin^{2n} t)$ and
$(1, \cos 2t, \ldots, \cos 2nt)$ span the same space, as do $(\sin t, \sin^3 t, \ldots, \sin^{2n+1} t)$ and
$(\sin t, \sin 3t, \ldots, \sin(2n + 1)t)$.


5. CONNECTIONS. The ideas discussed so far are related to some other topics, 
which we briefly discuss in this section. 


5.1. Chebyshev polynomials. As indicated by (4.3), $\cos nt$ can be expressed as a
polynomial in $\cos t$ for each $n \ge 0$. The polynomial itself is defined as
$T_n(x) = \cos(n \arccos x)$ for $x \in [-1, 1]$, so that, for $t \in [0, \pi]$,
$T_n(\cos t) = \cos(n \arccos \cos t) = \cos nt$. These polynomials, called Chebyshev polynomials, find
application in approximation theory [1]. To see a relationship between the
polynomials in (4.3) and those in (4.4), we consider, for $t \in [-\tfrac{\pi}{2}, \tfrac{\pi}{2}]$ and $n \ge 0$,

$$T_n(\sin t) = \cos(n \arccos \sin t) = \cos\left(\frac{n\pi}{2} - nt\right),$$

yielding the following equations for $k \ge 0$, and explaining the relationship.

$$T_{2k}(\sin t) = (-1)^k \cos 2kt \qquad\qquad T_{2k+1}(\sin t) = (-1)^k \sin(2k + 1)t$$
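
The two identities can be spot-checked numerically. The sketch below evaluates $T_n$ through the three-term recursion $T_{k+1}(x) = 2xT_k(x) - T_{k-1}(x)$, which is the recursion of Section 4.1 with $x = \cos t$; the function names are illustrative.

```python
import math

def chebyshev_T(n, x):
    """Evaluate T_n(x) with the recursion T_{k+1} = 2x T_k - T_{k-1}."""
    prev, cur = 1.0, x                  # T_0 and T_1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2 * x * cur - prev
    return cur

def max_identity_error(max_k, samples):
    """Largest deviation from the two identities over the sample points."""
    worst = 0.0
    for k in range(max_k + 1):
        sign = (-1) ** k
        for t in samples:
            even = chebyshev_T(2 * k, math.sin(t)) - sign * math.cos(2 * k * t)
            odd = chebyshev_T(2 * k + 1, math.sin(t)) - sign * math.sin((2 * k + 1) * t)
            worst = max(worst, abs(even), abs(odd))
    return worst
```

Calling `max_identity_error(4, [-1.5, -0.7, 0.0, 0.4, 1.2])` should give a value at the level of floating-point rounding.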


5.2. Finite Fourier series. The Fourier series of order $n$ is

$$a_0 + \sum_{i=1}^{n} a_i \sin it + \sum_{i=1}^{n} b_i \cos it,$$

and integrating such an expression is straightforward. The approach in Section 4
can be thought of as expressing polynomials in sin, or polynomials in cos, as finite
Fourier series, and then integrating. It is in fact true that any polynomial in sin and
cos, $\sum_{i,j=0}^{n} a_{ij} \sin^i t \cos^j t$, has a finite Fourier series. It is sufficient to consider
$\sin^i t \cos^j t$. If $i$ is even, the identity $\sin^2 t + \cos^2 t = 1$ can be used to convert this
to a polynomial in $\cos t$, which we have already seen to have a finite Fourier series.
If $i$ is odd, the same identity converts $\sin^i t \cos^j t$ to the form $(\sin t)p(\cos t)$, where
$p$ is a polynomial. Hence, it suffices to show that $\sin t \cos^n t$ has a finite Fourier
series for each $n$.

Adding $\sin(k - 1)t$ and $\sin(k + 1)t$ yields the recursion

$$\sin(n + 1)t = 2\sin nt \cos t - \sin(n - 1)t,$$

and clearly $\sin 0t$ and $\sin 1t$ are of the form $(\sin t)p(\cos t)$. Applying the recursion
and induction, we see that $\sin nt$ can be expressed in this form for all $n$. The first
few expressions are

$$\begin{aligned}
\sin 0t &= 0 \\
\sin 1t &= \sin t \\
\sin 2t &= 2(\sin t)\cos t - (0) = 2\sin t \cos t \\
\sin 3t &= 2(2\sin t \cos t)\cos t - (\sin t) = \sin t(-1 + 4\cos^2 t) \\
\sin 4t &= \sin t(-4\cos t + 8\cos^3 t) \\
\sin 5t &= \sin t(1 - 12\cos^2 t + 16\cos^4 t)
\end{aligned}$$

Thus, $(\sin t, \ldots, \sin(n + 1)t)$ and $(\sin t, \sin t \cos t, \ldots, \sin t \cos^n t)$ are bases for the same
space, so that $\sin t \cos^n t$ can be expressed as a finite Fourier series using only
sines, which can be obtained, as before, by matrix inversion. This completes the
argument.
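
As a concrete version of this last step, the following sketch expresses $\sin t \cos^n t$ in the basis $(\sin t, \ldots, \sin(n+1)t)$. Because the change-of-basis matrix is upper triangular, back substitution is used here in place of a full matrix inversion; that substitution, and the function name, are choices of this example.

```python
from fractions import Fraction

def sine_series_of_sin_cos_power(n):
    """Return c with sin t * cos^n t = sum_k c[k] * sin((k + 1) t), 0 <= k <= n."""
    # cols[k][i] = coefficient of cos^i t in p_k, where sin((k+1)t) = sin t * p_k(cos t).
    cols = [[1] + [0] * n]
    if n >= 1:
        cols.append([0, 2] + [0] * (n - 1))
    for k in range(1, n):
        times_cos = [0] + cols[k][:-1]          # multiply p_k by cos t
        cols.append([2 * c - p for c, p in zip(times_cos, cols[k - 1])])
    # Solve sum_k c[k] * cols[k] = (0, ..., 0, 1) by back substitution.
    c = [Fraction(0)] * (n + 1)
    for i in range(n, -1, -1):
        rhs = Fraction(int(i == n)) - sum(cols[k][i] * c[k] for k in range(i + 1, n + 1))
        c[i] = rhs / cols[i][i]
    return c
```

For example, `sine_series_of_sin_cos_power(2)` gives the coefficients of $\sin t$, $\sin 2t$, and $\sin 3t$ in $\sin t \cos^2 t$, which follows directly from the expression for $\sin 3t$ listed above.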


REFERENCES 


1. R. L. Burden and J. D. Faires, Numerical Analysis, PWS-Kent, Boston, fourth ed., 1989.
2. D. A. Sanchez, R. C. Allen, Jr. and W. T. Kyner, Differential Equations, Addison-Wesley, Reading,
second ed., 1988.


Mathematics Department 


Auburn University, Alabama 36849 
Jrogers@mail.auburn.edu 




Periodicity, Quasiperiodicity, 
and Bieberbach’s Theorem 
on Crystallographic Groups 


A. Vince 


1. INTRODUCTION. This article contains an elementary proof of a fundamental 
geometric theorem of Bieberbach. Moreover, it affords the opportunity to digress 
onto subjects that motivated the proof—periodicity and quasiperiodicity. The 
proof is in Sections 5 and 6. Most of the article consists of observations on 
isometries of Euclidean space (Section 2), crystallographic groups (Section 3), and 
the role of Bieberbach’s theorem in the theory of crystals and quasicrystals (Section 
4). 

A crystallographic group is a discrete, cocompact group of isometries of n- 
dimensional Euclidean space. All terms in this definition are explained in Section 
3. For now, it suffices to say that the two-dimensional crystallographic groups, 
often called wallpaper groups, are familiar as symmetry groups of tilings of the 
plane, and the three-dimensional groups arise as symmetry groups of crystals. 
There are exactly 2 one-dimensional, 17 two-dimensional, and 230 three-dimen- 
sional crystallographic groups. In dimension four there are 4,783 crystallographic 
groups [2]; this enumeration relies heavily on the computer. The exact number in
higher dimensions is unknown. The eighteenth of Hilbert’s famous problems posed 
at the 1900 International Congress of Mathematicians asks, in part, whether the 
number of crystallographic groups is finite in all dimensions. An affirmative answer 
was provided by Bieberbach [1] in papers that appeared in 1911 and 1912. The 
two- and three-dimensional crystallographic groups were first classified in the 
1890’s by Fedorov [7] and, independently, by Schoenflies [14]. The classification of 
the three-dimensional crystallographic groups can be found in many texts on 
mathematical crystallography, but these texts usually assume the following result. 
This same result is the main step in Bieberbach’s solution of Hilbert’s eighteenth 
problem. 


Theorem 1 (Bieberbach). If G is an n-dimensional crystallographic group, then G
contains translations in n linearly independent directions. 


Bieberbach’s proof of Theorem 1 [1] depends on a nontrivial number theoretic 
result concerning the approximation of irrational numbers by rationals. More
recent treatments by Wolf [17] and Charlap [5], also somewhat technical, are based 
on a proof of Frobenius [8] that appeared shortly after Bieberbach’s proof. A 
shorter proof by P. Buser [4] was a result of his study of Gromov’s work on almost 
flat manifolds. Gromov, in turn, has stated that his work on almost flat manifolds 
resulted from an attempt to understand the Bieberbach theorem [5]. Our proof is 
intended to be accessible to anyone with a basic undergraduate knowledge of 
abstract and linear algebra. 
The concept that plays the central role in the proof is what we call the axis of an
isometry g, the largest subspace of $\mathbb{R}^n$ on which g acts as a pure translation. A
main step in the proof, a result also proved by Buser [4], is an analog in $\mathbb{R}^n$ of the
well known Crystallographic Restriction in $\mathbb{R}^2$ and $\mathbb{R}^3$.


2. ISOMETRIES. An isometry is a mapping of $\mathbb{R}^n$ onto itself that preserves
distance. The following representation of an isometry is well known and very easy
to prove given the fact that an isometry with a fixed point is an orthogonal
transformation. Given any point $p \in \mathbb{R}^n$, an isometry g can be expressed as the
composition of an orthogonal transformation A, centered at p, and a translation:


(2.1) $g(x) = Ax + a$.


The orthogonal map A will be referred to as the rotational part and translation by 
a the translational part of g. The rotational part is, up to conjugacy, independent of 
the point p. The main result in this section is a refinement of (2.1), obtained by 
making an appropriate choice of the origin p. 


Lemma 1. Let g be an isometry of $\mathbb{R}^n$. There exists a unique affine subspace F
satisfying the following properties: (a) g is a translation when restricted to F, and (b) F
is maximal with respect to property (a). Moreover, if the origin is chosen to lie in F,
then


$g(x) = Qx + q$,
where Q is orthogonal, F is the set of fixed points of Q, and $q \in F$.


The subspace F of Lemma 1 will be called the axis of g and denoted axis(g). 
As an application of Lemma 1, consider any isometry g of $\mathbb{R}^3$. If axis(g) = $\mathbb{R}^3$,
then, according to Lemma 1, g is a translation. If axis(g) is a plane $\pi$, then g is
the composition of a reflection through $\pi$ and a translation in a direction along $\pi$.
Such an isometry is called a glide reflection (or a reflection if the translation is the
identity). If axis(g) is a line $l$, then g is the composition of a non-identity rotation
and a translation along $l$. Such an isometry is called a screw displacement (or a
rotation if the translation is the identity). Finally, if axis(g) is a point p, then g is
an orthogonal transformation having only p as fixed point. Such an orthogonal
transformation has canonical form


\[
\begin{pmatrix}
\cos\theta & -\sin\theta & 0\\
\sin\theta & \cos\theta & 0\\
0 & 0 & -1
\end{pmatrix},
\]


which is the composition of a non-identity rotation about a line $l$ with a reflection
in a plane perpendicular to $l$. Such an isometry is called a rotary reflection. Thus
Lemma 1 provides the following classification: every 3-dimensional isometry is a
translation, rotation, reflection, glide, screw or rotary reflection.


Proof of Lemma 1. As in (2.1), write $g(x) = Ax + a$. Let V be the subspace of fixed
points of A and $V^\perp$ the orthogonal complement of V. Note that both V and $V^\perp$
are invariant under A. Let q and $q^\perp$ be the components of a in the subspaces V
and $V^\perp$, respectively. Since $I - A$ is nonsingular when restricted to $V^\perp$, the
affine subspace $F = (I - A)^{-1}q^\perp$ is not empty. For $x \in F$ we have $Ax = x - q^\perp$,
which implies that $g(x) = Ax + a = x + (a - q^\perp) = x + q \in F$. Therefore g is a
translation when restricted to F. Define $Qx = Ax + q^\perp$. Then F is the set of fixed
points of Q; Q is orthogonal because it has a fixed point; and $g(x) = Qx + q$. We
leave to the reader the routine exercise of showing that F is unique, i.e., that there
does not exist even a one dimensional subspace, not contained in F, upon which g
acts as a translation. ∎
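The simplest instance of the construction in this proof is a plane isometry whose rotational part is a non-trivial rotation. Then V = {0}, so q = 0 and $q^\perp = a$, and the axis is the single point $F = (I - A)^{-1}a$. The following Python sketch (function names are hypothetical, chosen only for this illustration) computes that point and checks that g fixes it.

```python
import math


def apply_isometry(theta, a, x):
    """g(x) = A x + a, where A is the plane rotation by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1] + a[0], s * x[0] + c * x[1] + a[1])


def axis_point(theta, a):
    """Solve (I - A) F = a for the plane rotation A by theta (theta not a multiple of 2*pi)."""
    c, s = math.cos(theta), math.sin(theta)
    # I - A = [[1 - c, s], [-s, 1 - c]], with determinant (1 - c)^2 + s^2 = 2 - 2c > 0
    det = (1 - c) ** 2 + s ** 2
    fx = ((1 - c) * a[0] - s * a[1]) / det
    fy = (s * a[0] + (1 - c) * a[1]) / det
    return (fx, fy)


theta, a = math.pi / 2, (1.0, 0.0)
F = axis_point(theta, a)
print(F)                            # the axis of g, a single point
print(apply_isometry(theta, a, F))  # g leaves that point where it is
```

In this case the translation along the axis is trivial, so g is a rotation about F in the terminology introduced after Lemma 1.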


3. CRYSTALLOGRAPHIC GROUPS, DELAUNAY SETS AND VORONOI
TILINGS. An n-dimensional crystallographic group G is a discrete, cocompact
subgroup of isometries of $\mathbb{R}^n$. Discrete means that any ball contains at most finitely
many points in the G-orbit of any point. Cocompact means that the quotient space
$\mathbb{R}^n/G$ is compact, where the quotient is the set of orbits with the quotient
topology. A less abstract, but equivalent, definition of crystallographic group is
more appropriate for our purpose. A set X of points of $\mathbb{R}^n$ is called an (r, R)-Delaunay
set, or simply Delaunay set, if


(1) X is discrete: there is a number r such that every ball of radius r centered 
at a point of X contains no other points of X. 

(2) X is uniform: there is a number R such that every ball of radius R contains 
a point of X. 


Let G be a group of isometries of $\mathbb{R}^n$ and p any point of $\mathbb{R}^n$. Then G is a
crystallographic group if and only if the orbit of p is a Delaunay set. This can be
restated in terms of Voronoi tilings as follows. Let P be the orbit of any point
under the action of a group G of isometries of $\mathbb{R}^n$. For any $p \in P$, let $D_p$ denote
the Voronoi region of p. This is the set of points at least as close to p as to any
other point of P:


\[ D_p = \{x \in \mathbb{R}^n : |x - p| \le |x - y| \text{ for all } y \in P\}. \]


The Voronoi region $D_p$ is the intersection of the half-spaces determined by the
perpendicular bisectors of the line segments joining p to each of the other points
of P. The group G is a crystallographic group if and only if each Voronoi region
in $\{D_p \mid p \in P\}$ is a bounded convex polytope. In particular, the Voronoi regions of
any orbit of a crystallographic group tile $\mathbb{R}^n$; all the tiles are congruent; G acts
transitively on these tiles; and the action of $g \in G$ on a single tile completely
determines g.
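The definition of a Voronoi region translates directly into a membership test. The sketch below is only an illustration: it takes as a hypothetical orbit a finite window of the square lattice in the plane (the orbit of the origin under the integer translations), and the small tolerance is an assumption made to absorb rounding on the boundary.

```python
import math


def in_voronoi_region(x, p, orbit):
    """True when x is at least as close to p as to every point of the orbit."""
    dp = math.dist(x, p)
    return all(dp <= math.dist(x, y) + 1e-12 for y in orbit)


# Hypothetical orbit: the points of the square lattice near the origin.
orbit = [(i, j) for i in range(-3, 4) for j in range(-3, 4)]
p = (0, 0)

print(in_voronoi_region((0.4, -0.3), p, orbit))  # inside the unit square centered at p
print(in_voronoi_region((0.6, 0.0), p, orbit))   # closer to (1, 0) than to p
```

For this orbit the region of the origin is the closed unit square centered there, a bounded convex polytope as the criterion above requires.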

This definition makes it easy to prove a first approximation to Bieberbach’s 
theorem, a result Buser [4] calls Mini-Bieberbach. It states that an n-dimensional 
crystallographic group must contain n isometries that are nearly translations, in 
the sense that the translational parts are linearly independent and the rotational 
parts are close to the identity. As a measure of the proximity of the rotational part 
of an isometry g to the identity, define 


\[ \operatorname{rot}(g) = \max_{x \ne 0} \frac{|Ax - x|}{|x|}, \]


where A is the rotational part of g. 
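For a rotation of the plane the quantity rot(g) can be estimated by sampling unit vectors. The Python sketch below makes two assumptions for convenience: points of the plane are treated as complex numbers, so the rotation acts as multiplication by $e^{i\theta}$, and the function name is hypothetical.

```python
import cmath
import math


def rot_plane_rotation(theta, samples=360):
    """Estimate max |Ax - x| / |x| for the plane rotation A by the angle theta."""
    A = cmath.exp(1j * theta)
    best = 0.0
    for k in range(samples):
        x = cmath.exp(2j * math.pi * k / samples)  # unit vector; the ratio is scale invariant
        best = max(best, abs(A * x - x) / abs(x))
    return best


print(rot_plane_rotation(math.pi / 3))
print(rot_plane_rotation(math.pi))
```

Since $|e^{i\theta}x - x| = |e^{i\theta} - 1|\,|x|$, every sample gives the same ratio $2|\sin(\theta/2)|$; in particular rot(g) never exceeds 2.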


Lemma 2 (Mini-Bieberbach). Let G be an n-dimensional crystallographic group.
Given any point p and any $\epsilon > 0$, there exist in G elements $g_i(x) = A_i x + a_i$,
$i = 1, 2, \dots, n$, centered at p, such that


(1) $\operatorname{rot}(g_i) < \epsilon$ for all i and (2) $\{a_1, a_2, \dots, a_n\}$ is linearly independent.


Proof: Consider the Voronoi tiling with respect to the orbit of the point p.
Further, let b be an arbitrary direction and consider the sequence $\{D_i\}$ of tiles that
intersect the ray with endpoint p and direction b. Let $g_i(x) = A_i x + a_i$ be an
element of G with rotational part $A_i$ centered at p and such that $g_i$ takes $D_0$ to
$D_i$, where $D_0$ is the tile centered at p. Because the orthogonal group is compact,
the sequence $\{A_i\}$ of rotational parts has a convergent subsequence. For us this means that there must exist two
isometries $g_i$ and $g_j$ with the properties: (1) $A_i$ and $A_j$ are sufficiently close in
the sense that $\operatorname{rot}(g_j \circ g_i^{-1}) < \epsilon$, and (2) $|a_j - a_i|$ is sufficiently large so that the
angle between b and the vector from $g_i(p)$ to $g_j(p)$ is less than $\epsilon$. Then the
element $g = g_j \circ g_i^{-1}$ satisfies statement (1) in the lemma. The lemma follows by
repeating this argument where, at the k-th stage, b is chosen orthogonal to the
subspace spanned by $a_1, a_2, \dots, a_{k-1}$. ∎


4. CRYSTALS AND QUASICRYSTALS. With the atoms and molecules of a real
crystal in mind, define a crystal as the image of a finite number of points of $\mathbb{R}^n$
under a group generated by n linearly independent translations. The symmetry
group, sym(X), of a crystal X is the group of isometries that leave the points of X
as a whole invariant. A crystal clearly has the following properties: X is discrete; X
is periodic, which means that sym(X) contains translations in n linearly independent
directions; X is the union of finitely many lattices, a lattice being the image of
a single point under a group generated by n linearly independent translations.

The notions, crystal and crystallographic group, are intimately related, as 
described in Theorem 2. Although it would be surprising if it were otherwise, this 
theorem is illustrative for a couple of reasons. First, it is another consequence of 
Bieberbach’s theorem. Second, the proof uses two essential ingredients in crystallographic
analysis, the translation group and the point group. Let g be an element of
a crystallographic group G and let $p \in \mathbb{R}^n$. Consider the representation (2.1):
$g(x) = Ax + a$, with respect to p. The mapping $\phi: g \mapsto A$ induces a homomorphism
of G into the orthogonal group. The kernel of $\phi$ is the translation subgroup
T of G; the image of $\phi$ is the point group at p.


Theorem 2. A set X of points in Euclidean space is a crystal if and only if X is discrete 
and sym(X ) is a crystallographic group. 


Proof: Assume that X is a crystal, G its symmetry group, and T the subgroup of G
generated by the n independent translations that define X. If $p \in X$ then G(p),
the orbit of p, is the union of finitely many lattices since G(p) is invariant under
T. Therefore, G(p) is a Delaunay set, so G is a crystallographic group.

In the other direction, let G be the symmetry group of X, and let T be the
translation subgroup of G, which, by Bieberbach’s theorem, is generated by n
independent translations. Let $p \in X$ and let L be the lattice that is the image of p
under the action of T. From the fact that T is normal in G, it is easy to show that
L is invariant under the action of the point group $\Phi$ at p and that, for any $A \in \Phi$,
A is completely determined by its action on finitely many points of L. Then $\Phi$,
being a group of permutations of these points, is finite. Being isomorphic to $\Phi$, the
quotient G/T is also finite. Express $G = \bigcup_i T g_i$ as the disjoint union of finitely
many cosets of T. Denoting by $D_p$ the Voronoi region at p, there are finitely many
points in $X_p = X \cap D_p$ because X is discrete. Now we have
$X = G(X_p) = \bigcup_i T g_i(X_p) = T\bigl(\bigcup_i g_i(X_p)\bigr)$. Therefore X is the image of finitely many points
under the action of the translation subgroup T. By Theorem 1, T contains
translations in n linearly independent directions, so, by definition, X is a crystal. ∎

It was actually quasicrystals, rather than crystals, that drew our attention to
Bieberbach’s result. A decade ago Shechtman, Blech, Gratias, and Cahn [16]
discovered the first “quasicrystal,” an alloy of aluminum and manganese whose
electron diffraction pattern consisted of sharp spots exhibiting a 5-fold symmetry.
This elicited great excitement in solid-state science for the following reason. A
distinct diffraction pattern with sharp spots, called Bragg peaks, is evidence of
“long range order,” which, until that time, meant a crystal structure. On the other
hand, the well known Crystallographic Restriction states that the only rotational
symmetry possible for a crystal in two or three dimensions is 2, 3, 4, or 6-fold 
symmetry. In other words, a crystal structure and the observed 5-fold symmetry are 
incompatible. Since this original discovery, various similar materials (aluminum- 
lithium-copper, uranium-palladium-silicon, and other compositions) have been 
discovered and analyzed, and the consensus among solid-state scientists is that 
these materials cannot be explained within the framework of a periodic structure, 
that they are truly new. The “long range order” in quasicrystals, whatever is 
causing the Bragg peaks in the electron diffraction, is often referred to as 
“quasiperiodicity.” 
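One standard route to the Crystallographic Restriction in the plane, which this article does not spell out, is the trace argument: a rotation that maps a lattice to itself is represented in a lattice basis by an integer matrix, so its trace $2\cos(2\pi/k)$ must be an integer. The Python sketch below simply tests that condition for small k; the function name and the numerical tolerance are assumptions made for the illustration.

```python
import math


def lattice_compatible_orders(max_order=12, tol=1e-9):
    """Orders k for which 2*cos(2*pi/k), the trace of a plane rotation by 2*pi/k, is an integer."""
    orders = []
    for k in range(1, max_order + 1):
        trace = 2.0 * math.cos(2.0 * math.pi / k)
        if abs(trace - round(trace)) < tol:
            orders.append(k)
    return orders


print(lattice_compatible_orders())
```

Apart from the trivial order 1, the surviving orders are exactly the 2, 3, 4 and 6-fold symmetries named above; 5 is excluded.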

In any study of quasiperiodicity, a minimum that should be required of a set X 
of points is that X be a Delaunay set. However, this alone implies little about 
global order, an example being the molecules of a gas in a closed container. 
Senechal and Taylor [15] inquire about the consequences of requiring the following 
additional local congruence property. For $x \in X$ and a real number $\rho$, let $N_\rho(x)$
denote the intersection of X with the ball of radius $\rho$ centered at x.


Property $N_\rho$: For any two points $x, y \in X$, the neighborhoods $N_\rho(x)$ and $N_\rho(y)$ are
congruent by a congruence taking x to y.


Unfortunately, as Senechal and Taylor point out, a theory based on the local
regularity Property $N_\rho$ will not be interesting because it already implies that X is a
crystal.


Theorem 3. Let X be an (r, R)-Delaunay set in $\mathbb{R}^n$. There exists a number $\rho$,
depending only on r, R, and n, such that if property $N_\rho$ holds, then X is a crystal.


Theorem 3 is again a consequence of Bieberbach’s theorem. The proof of 
Theorem 3 is in two parts. First, in 1976 Delaunay and his colleagues [6] gave an 
elegant proof that, under the conditions of Theorem 3, the symmetry group G of 
X acts transitively. Since the orbit X of G is a Delaunay set, G is a crystallo- 
graphic group. Theorem 3 now follows directly from Theorem 2, which, in turn, 
was a consequence of Bieberbach’s theorem. 

Theorem 3 implies that any investigation into quasiperiodicity requires ideas
more subtle than the local homogeneity given by property $N_\rho$. Advances in this
direction have been made by Penrose [12], de Bruijn [3], Kramer and Neri [10], 
Katz and Duneau [9], Mozes [11], Radin [13], and many others, but these results lie 
outside the scope of this note. 


5. CONJUGACY IN A CRYSTALLOGRAPHIC GROUP. Very informally, to say
that two isometries of Euclidean space are conjugate means that they do the same
thing, but in different places. In $\mathbb{R}^3$, for example, the conjugate $kgk^{-1}$ of a
$(\pi/2)$-rotation g about a line $l$ is a $(\pi/2)$-rotation about the image line $k(l)$.
Lemma 3 is a more formal statement. The notation is as follows. Let g be an
isometry; use Lemma 1 to express it in the form $g(x) = Qx + q$, where Q is
orthogonal and $q \in$ axis(g). Define trans(g) = q, where trans(g) is considered as a
free vector so, in statement (2) of Lemma 3, k maps both the initial and terminal
point of the vector.


Lemma 3. If g and k are isometries and $h = kgk^{-1}$ then


(1) axis(h) = k(axis(g))
(2) trans(h) = k(trans(g))
(3) rot(h) = rot(g)
(4) $\operatorname{rot}(hg^{-1}) \le 2\operatorname{rot}(k)\operatorname{rot}(g)$.


Proof: The first three statements are routine to verify. The following proof of
statement (4) is due to Buser [4]. Let A and B be the orthogonal parts of g and k,
respectively, centered at the same point. Then
\[ BAB^{-1}A^{-1} - I = \bigl((B - I)(A - I) - (A - I)(B - I)\bigr)B^{-1}A^{-1} \]
and, since $|B^{-1}A^{-1}x| = |x|$, it follows that


\[ \operatorname{rot}(kgk^{-1}g^{-1}) = \operatorname{rot}(BAB^{-1}A^{-1}) \le 2\operatorname{rot}(B)\operatorname{rot}(A) = 2\operatorname{rot}(k)\operatorname{rot}(g). \] ∎
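The commutator identity and the resulting estimate can be checked numerically for rotations of three-dimensional space. The sketch below is an illustration only. It assumes that both orthogonal parts are rotations, and it uses the fact that a rotation of $\mathbb{R}^3$ by an angle $\theta$ has trace $1 + 2\cos\theta$, so that its rot value $2|\sin(\theta/2)|$ equals $\sqrt{3 - \operatorname{trace}}$. All function names are hypothetical.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(X):
    return [[X[j][i] for j in range(3)] for i in range(3)]


def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(3)] for i in range(3)]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def rot(M):
    """rot of a rotation matrix M of R^3: 2*|sin(angle/2)| = sqrt(3 - trace M)."""
    trace = M[0][0] + M[1][1] + M[2][2]
    return math.sqrt(max(0.0, 3.0 - trace))


I = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
A, B = rot_x(0.3), rot_z(0.5)
inverse_part = matmul(transpose(B), transpose(A))   # B^{-1} A^{-1}; inverses are transposes
commutator = matmul(matmul(B, A), inverse_part)     # B A B^{-1} A^{-1}

lhs = sub(commutator, I)
BI, AI = sub(B, I), sub(A, I)
rhs = matmul(sub(matmul(BI, AI), matmul(AI, BI)), inverse_part)

print(max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3)))
print(rot(commutator), 2 * rot(A) * rot(B))
```

The first printed number measures the discrepancy between the two sides of the identity, and the second line compares the two sides of statement (4).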


Although somewhat technical, the next lemma is essential to the proof of
Bieberbach’s Theorem. A rough sketch of how it comes into play is as follows. Let
G be a crystallographic group. Mini-Bieberbach (Lemma 2) implies the existence
of n isometries with translational parts in independent directions and with rotational
parts that are close to the identity. To prove Bieberbach’s theorem it
remains to show only that each such isometry g must necessarily be a translation.
Assume the contrary, that g is not a translation. Under this assumption, a certain
set C of conjugates of g, each distinct from g, is not empty. Lemma 4 is used to
prove that axis(g) and axis($\bar g$) are not too close to each other if $\bar g \in C$. So among
the isometries in C, let h have axis closest to the axis of g. Then it can be shown
that axis($hgh^{-1}$) is even closer to axis(g) than is axis(h), a contradiction if
$hgh^{-1} \in C$. Lemma 4 is required again to show that $hgh^{-1} \in C$. The complete
proof appears in Section 6.

As apparent from this outline, the minimum distance between the axes of two 
isometries g and h plays a crucial role. We use the notation 


\[ d(g, h) = \min\{|x - y| : x \in \operatorname{axis}(g),\ y \in \operatorname{axis}(h)\}. \]


Lemma 4. If g is an element of a crystallographic group, then there exist positive
numbers $\delta$ and c with the following property. Let $h = kgk^{-1}$ be a conjugate of g. If
$\operatorname{rot}(k) < \delta$ and either


(1) $d(g, h) \le c$ or (2) $hgh^{-1} = g$,
then h = g.


Proof: Let p and $\bar p$ be closest points on axis(g) and axis(h), respectively, and
consider the Voronoi tiling with respect to the orbit of p. Choose c small enough
so that, if $d(g, h) \le c$, then $\bar p$ lies in the interior of tile $D_p$. Choose $\delta < \sqrt{2}$ and, in
addition, small enough so that both of the following conditions are satisfied.


(1) If $\bar p$ lies in the interior of tile $D_p$ and $\operatorname{rot}(k) < \delta$, then $g(D_p) \cap h(D_p) \ne \emptyset$.
This is possible due to statement (2) of Lemma 3.

(2) If $f(D_p) = D_p$ and $\operatorname{rot}(f) < 4\delta$, then f must act as the identity on $D_p$. This
is possible because $D_p$ is a bounded polytope with finite symmetry group.


Now assume that $\operatorname{rot}(k) < \delta$ and $d(g, h) \le c$. By condition (1) we have
$g(D_p) \cap h(D_p) \ne \emptyset$, which implies that $g(D_p) = h(D_p)$ and $g^{-1}h(D_p) = D_p$. By parts (3)
and (4) of Lemma 3, $\operatorname{rot}(g^{-1}h) = \operatorname{rot}(hg^{-1}) \le 2\operatorname{rot}(k)\operatorname{rot}(g) \le 4\operatorname{rot}(k) < 4\delta$. So by
condition (2), with $f = g^{-1}h$, the isometry $g^{-1}h$ acts as the identity on $D_p$. Since
an element of G is determined by its action on $D_p$, we have h = g.

Next assume that $g_0 := hgh^{-1} = g$. We claim that d(g, h) = 0, in which case
h = g follows from what has already been proved. To prove the claim, express
$h(x) = Qx + q$ as in Lemma 1, where Q is orthogonal and $q \in V :=$ axis(h).
Taking the center of Q as the origin, let W + a be the axis of g, where W is a
linear subspace of $\mathbb{R}^n$. Using statement (1) of Lemma 3,
$Q(W) + Qa + q = h(W + a) = h(\operatorname{axis}(g)) = \operatorname{axis}(g_0) = \operatorname{axis}(g) = W + a$.
This implies both (a) $Q(W) = W$ and (b) $(I - Q)a \in q + W$. But
$V = \operatorname{axis}(h) = k(\operatorname{axis}(g)) = k(a + W)$,
which, by the same reasoning as above, implies (c) V = A(W), where A is the
rotational part of k centered at the origin. We next prove, by contradiction, that
W = V. Since subspaces V and W have the same dimension, assume that there
exists a $w \in W \setminus V$. Let $w = v + v^\perp$, where $v \in V$ and $v^\perp \in V^\perp$, and let
$x = w - Qw$. Then $x \in W$ because of statement (a), and $x \in V^\perp$ because
$x = (v + v^\perp) - (Qv + Qv^\perp) = v^\perp - Qv^\perp \in V^\perp$. Hence, by statement (c), we know that A
takes the element x of $V^\perp$ to an element of V. This contradicts $\operatorname{rot}(k) < \sqrt{2}$.
Now W = V and, from statement (b), we have $(I - Q)a \in V$, which implies that
$a \in V$ because $I - Q$ leaves both V and $V^\perp$ invariant and is non-singular when
restricted to $V^\perp$. Hence axis(g) = W + a = V + a = V = axis(h). ∎


6. A CRYSTALLOGRAPHIC RESTRICTION AND THE PROOF OF BIEBERBACH’S
THEOREM. Theorem 4 is an analog in $\mathbb{R}^n$ of the Crystallographic
Restriction discussed in Section 4. In particular, if X is a crystal, then Theorems 2
and 4 eliminate the possibility of X possessing a k-fold rotation about a codimension
two axis if $k \ge 13$. Also notice that Bieberbach’s Theorem is an immediate
corollary of Theorem 4 because the existence of translations in n linearly independent
directions is guaranteed by Lemma 2.
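The threshold 13 can be recovered with a short computation. A rotation by $2\pi/k$ about a codimension two axis acts on the plane orthogonal to the axis as a plane rotation, so its rot value is $2\sin(\pi/k)$; the Python sketch below, whose function name is hypothetical, looks for the first k at which this value drops to 1/2, the bound in the hypothesis of Theorem 4.

```python
import math


def rot_of_k_fold_rotation(k):
    """rot of a rotation by 2*pi/k (k >= 2), namely 2*sin(pi/k)."""
    return 2.0 * math.sin(math.pi / k)


smallest_k = next(k for k in range(2, 100) if rot_of_k_fold_rotation(k) <= 0.5)
print(smallest_k, rot_of_k_fold_rotation(12), rot_of_k_fold_rotation(13))
```

A 12-fold rotation still has rot value slightly above 1/2, so Theorem 4 says nothing about it, while every rotation of order 13 or more falls under the theorem and, not being a translation, cannot occur.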


Theorem 4. If g is any non-identity element of a crystallographic group such that
$\operatorname{rot}(g) \le 1/2$, then g must be a translation.


Proof: By way of contradiction, assume that g is not a translation. Let $\delta$ and c be
as in Lemma 4, and let $\epsilon = \min(\delta, c/(4|\operatorname{trans}(g)|))$. Consider the set C consisting
of all conjugates $\bar g = kgk^{-1}$ of g in G such that


(1) $\bar g \ne g$ and (2) $\operatorname{rot}(k) < \epsilon$.


The set C is not empty for the following reason. Since g is not a translation,
axis(g) $\ne \mathbb{R}^n$. Lemma 2, with point p on axis(g), then guarantees the existence of
an isometry $k \in G$ such that $\operatorname{rot}(k) < \epsilon$ and the translational part of k does not
lie in axis(g). The latter condition implies that axis($\bar g$) and axis(g) are distinct
because, by Lemma 3, axis($\bar g$) = k(axis(g)). Since axis($\bar g$) $\ne$ axis(g), also $\bar g \ne g$.
Let


\[ d = \inf_{\bar g \in C} d(g, \bar g) \ge c > 0, \]
the inequalities resulting directly from Lemma 4. The contradiction that will finish
the proof is the existence of a $g_0 \in C$ such that $d(g, g_0) < d$. Let $h \in C$ be such
that


\[ d(g, h) < \tfrac{5}{4}d. \]
Then $g_0 = hgh^{-1}$ is such an element. It remains to show only that $g_0 \in C$ and that
$d(g, g_0) < d$.

We first show that $g_0 \in C$. Because $h \in C$, we have $h \ne g$ and $h = kgk^{-1}$,
where $\operatorname{rot}(k) < \epsilon \le \delta$. Therefore $g_0 \ne g$ by Lemma 4. To verify the second
condition in the definition of C, we show that there exists a $\bar k \in G$ such that
$g_0 = \bar k g \bar k^{-1}$, where $\operatorname{rot}(\bar k) < \epsilon$. Since $g_0 = hgh^{-1} = (hg^{-1})g(hg^{-1})^{-1}$, statement
(4) of Lemma 3 implies that $\operatorname{rot}(hg^{-1}) \le 2\operatorname{rot}(g)\operatorname{rot}(k) \le \operatorname{rot}(k) < \epsilon$. Hence take
$\bar k = hg^{-1}$.

To show that $d(g, g_0) < d$, let V = axis(g) and V' = axis(h). Let $p \in V$ and
$p' \in V'$ be closest points on V and V', respectively. Further, let $\bar V$ denote the
image of V under the translational part of h, and let $\bar p \in \bar V$ be a closest point to
p on $\bar V$. If trans(h) = 0 then $\bar p = p$. Otherwise, since $h \in C$ express $h = kgk^{-1}$,
and let $\alpha$ be the angle between trans(g) and trans(h). By elementary trigonometry
we have $\sin\alpha \le \operatorname{rot}(k)$. Condition (2) in the definition of C and statement (2) of
Lemma 3 yield


\[ |p - \bar p| \le |\operatorname{trans}(h)|\sin\alpha \le |\operatorname{trans}(g)|\operatorname{rot}(k) < \tfrac{1}{4}c \le \tfrac{1}{4}d \]
and
\[ |\bar p - p'| \le |\bar p - p| + |p - p'| < \tfrac{1}{4}d + d(g, h) < \tfrac{3}{2}d. \]


If $p''$ is the image of $\bar p$ under the rotational part of h then, using statement (3) of
Lemma 3,


\[ |\bar p - p''| \le |\bar p - p'|\operatorname{rot}(h) < \tfrac{3}{2}d\operatorname{rot}(g) \le \tfrac{3}{4}d. \]


But $p \in$ axis(g) and $p''$, being in the image of axis(g) under h, lies in axis($g_0$).
Therefore


\[ d(g, g_0) \le |p - p''| \le |p - \bar p| + |\bar p - p''| < \tfrac{1}{4}d + \tfrac{3}{4}d = d. \] ∎
Integration Over Spheres
and the Divergence Theorem for Balls 


John A. Baker 


The divergence theorem of Gauss asserts that if $\Omega$ is a nonempty bounded open
subset of $\mathbb{R}^n$ ($n \ge 2$) with $C^1$ boundary $\partial\Omega$ and if $F: \bar\Omega \to \mathbb{R}^n$ is a $C^1$ function then


(*) $\int_\Omega \operatorname{div} F(x)\,dx = \int_{\partial\Omega} F(s)\cdot\nu(s)\,d\sigma(s)$


where $\nu(s)$ is the “unit outward normal” to $\partial\Omega$ at $s \in \partial\Omega$ and where integration on
the right is with respect to the “$(n-1)$-dimensional surface measure” on $\partial\Omega$.
A major obstacle in its formulation is to make sense of the right side of (*),
i.e., to suitably define $\nu$ and integration over $\partial\Omega$.

The special case in which $\Omega$ is a ball (and $\partial\Omega$ a sphere) is of considerable
interest because, for example, many fundamental properties of harmonic functions 
(including the mean value property, the weak maximum principle and the Poisson 
integral formula) follow from it in a relatively painless fashion; see, e.g., Chapter 1 
of [1] and Chapter 2 of [4]. Even this case, when included in advanced calculus 
texts (usually with n = 2 or 3), is either dependent on a sophisticated and lengthy 
discussion of Stokes’ theorem or on a definition of surface integral that involves 
improper integrals (in dimension n — 1), which are usually inadequately treated. 
Moreover the rotational (or isometric) invariance of integration over a sphere—in 
a sense (see the concluding remark) its characteristic property—is rarely men- 
tioned in calculus texts. 

The theory of harmonic functions in two real variables is addressed in many 
introductory analysis texts, presumably because it is virtually synonymous with 
analytic function theory (in one complex dimension). Such books are rarely 
concerned with harmonic functions of three or more variables, perhaps due to the 
lack of a suitably succinct, but sufficiently profound, version of the divergence 
theorem. 

The aim of this article is to develop a utility-grade theory of integration over
spheres and use it to formulate and prove the divergence theorem for balls in $\mathbb{R}^n$.
Actually, we carry out this programme in detail only in case $\Omega = B^n$ (the closed
ball in $\mathbb{R}^n$ centered at the origin with radius 1), where we denote $\partial\Omega$ by $S^{n-1}$
in deference to our topological friends. As we observe in §6, the case of an
arbitrary ball can, without difficulty, be reduced to that of $B^n$.

Throughout this paper n denotes a fixed but arbitrary natural number. Unless
otherwise indicated, $n \ge 2$. Here is an outline of our development.


(i) Define, in a simple way, an appropriate “integral” for continuous real
valued functions on $S^{n-1}$, deduce salient properties thereof, including its
rotational invariance and a “polar coordinates change of variable” formula
(Theorem 1) and, with the aid of the gamma function, compute the integral
of a polynomial over $B^n$ and over $S^{n-1}$.

(ii) Use (i) to formulate and prove the divergence theorem for polynomial
functions over $B^n$.

(iii) Use the higher-dimensional Weierstrass approximation theorem and (ii) to
prove the divergence theorem for $B^n$.


1. BACKGROUND AND NOTATION. Let $\mathbb{Z}$ denote the integers. For $1 \le n \in \mathbb{Z}$
and $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$ let $x \cdot y = x_1y_1 + \cdots + x_ny_n$ and
$|x| = (x \cdot x)^{1/2}$. For $1 \le n \in \mathbb{Z}$ let $B^n = \{x \in \mathbb{R}^n : |x| \le 1\}$ and let
$S^{n-1} = \{x \in \mathbb{R}^n : |x| = 1\}$, the boundary of $B^n$. If $A \subseteq \mathbb{R}^n$ let $A_0 = A \setminus \{0\}$. For $A \subseteq \mathbb{R}$ let
$A_+ = \{x \in A : x \ge 0\}$. For $1 \le n \in \mathbb{Z}$ and $a, b \in \mathbb{R}$ with $0 \le a < b$, let
$B^n(a, b) = \{x \in \mathbb{R}^n : a \le |x| \le b\}$; if a real-valued function f is defined and Riemann
integrable on $B^n(a, b)$ we will sometimes denote the Riemann integral of f thereover
by


\[ \int_{a \le |x| \le b} f(x)\,dx. \]
For a given nonempty subset A of $\mathbb{R}^n$ ($n \ge 1$) and a bounded $f: A \to \mathbb{R}$ (or $\mathbb{R}^n$), let
$\|f\|_A = \sup\{|f(x)| : x \in A\}$. We will need the following linear cases of the “change
of variable theorem”.


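Proposition 1. If $0 \le a < b$, $f: B^n(a, b) \to \mathbb{R}$ is Riemann integrable on $B^n(a, b)$,
$0 < \rho \in \mathbb{R}$, and Q is a real orthogonal $n \times n$ matrix then


(i) $\int_{a \le |x| \le b} f(x)\,dx = \rho^n \int_{a/\rho \le |y| \le b/\rho} f(\rho y)\,dy$
and
(ii) $\int_{a \le |x| \le b} f(x)\,dx = \int_{a \le |y| \le b} f(yQ)\,dy$;


in particular,


(iii) $\int_{a \le |x| \le b} f(x)\,dx = \int_{a \le |y| \le b} f(\epsilon_1 y_1, \dots, \epsilon_n y_n)\,d(y_1, \dots, y_n)$


if $\epsilon_1, \dots, \epsilon_n \in \{1, -1\}$.


We will refer to property (ii) as rotational (or isometric) invariance. For most of
what follows we will need this theorem only in case $f: \mathbb{R}^n \to \mathbb{R}$ and f is continuous
on $\mathbb{R}^n_0$.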


2. THE JOY OF INTEGRATION ON $S^{n-1}$. Suppose that $n \ge 2$ and $g: S^{n-1} \to \mathbb{R}$
is continuous. We aim to define an “integral” of g over $S^{n-1}$, which we denote by
$\int_{S^{n-1}} g\,d\sigma_{n-1}$, in such a way that when n = 2 (respectively, n = 3) it conjures up arc
length (respectively, surface area). Given such a g, define $\tilde g: \mathbb{R}^n \to \mathbb{R}$ by


\[ \tilde g(x) = \begin{cases} g(|x|^{-1}x) & \text{for } x \in \mathbb{R}^n_0,\\ 0 & \text{for } x = 0. \end{cases} \]


Then $\tilde g$ is continuous on $\mathbb{R}^n_0$, $\tilde g(rs) = g(s)$ for $s \in S^{n-1}$ and $0 < r \in \mathbb{R}$, and
$\|g\|_{S^{n-1}} = \|\tilde g\|_{\mathbb{R}^n}$. It follows that $\tilde g$ is Riemann integrable on $B^n$ (or any other
subset of $\mathbb{R}^n$ having Jordan content). We may therefore legitimately define


\[ \int_{S^{n-1}} g\,d\sigma_{n-1} = n\int_{B^n} \tilde g(x)\,dx; \tag{1} \]


it will sometimes be convenient to use “Leibnizian” notation and write


\[ \int_{S^{n-1}} g(s)\,d\sigma_{n-1}(s) \quad\text{or}\quad \int_{S^{n-1}} g(s_1, \dots, s_n)\,d\sigma_{n-1}(s_1, \dots, s_n) \]


instead of $\int_{S^{n-1}} g\,d\sigma_{n-1}$.

One of the main payoffs of this definition is Theorem 1 below. Before discussing
it we observe some simpler properties of our integral.
First note that if $V_n = \int_{B^n} 1\,dx$, the “n-dimensional volume” of $B^n$, and if
$A_{n-1} = \int_{S^{n-1}} 1\,d\sigma_{n-1}$, the “$(n-1)$-dimensional surface area” of $S^{n-1}$, then


\[ A_{n-1} = nV_n \quad\text{for } n \ge 2. \]
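The relation $A_{n-1} = nV_n$ is easy to tabulate once $V_n$ is known. The Python sketch below assumes the standard closed form $V_n = \pi^{n/2}/\Gamma(n/2 + 1)$, which is not derived in this article and is used here only as an input; the function names are hypothetical.

```python
import math


def ball_volume(n):
    """V_n from the standard closed form pi^(n/2) / Gamma(n/2 + 1), taken as an assumption."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def sphere_area(n):
    """A_{n-1} = n * V_n, the relation that follows from definition (1) with g = 1."""
    return n * ball_volume(n)


for n in range(2, 6):
    print(n, ball_volume(n), sphere_area(n))
```

For n = 2 and n = 3 the second column gives the circumference of the unit circle and the area of the unit sphere.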


If $g: S^1 \to \mathbb{R}$ is continuous then


\[ \int_{S^1} g\,d\sigma_1 = \int_0^{2\pi} g(\cos\theta, \sin\theta)\,d\theta \]


(the arc length integral) since, with the aid of polar coordinates,


\[ \int_{B^2} g\!\left(\frac{x}{(x^2+y^2)^{1/2}}, \frac{y}{(x^2+y^2)^{1/2}}\right) d(x, y) = \int_0^{2\pi}\!\int_0^1 g(\cos\theta, \sin\theta)\,r\,dr\,d\theta. \]
It follows that $V_2 = \pi$ and $A_1 = 2\pi$, as expected. Here are some other
elementary properties of integration over $S^{n-1}$ that follow easily from the definition
and Proposition 1.


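Proposition 2. If $g, h: S^{n-1} \to \mathbb{R}$ are continuous, $\alpha, \beta \in \mathbb{R}$, and Q is a real
orthogonal $n \times n$ matrix, then


(i) $\int_{S^{n-1}} (\alpha g + \beta h)\,d\sigma_{n-1} = \alpha\int_{S^{n-1}} g\,d\sigma_{n-1} + \beta\int_{S^{n-1}} h\,d\sigma_{n-1}$,

(ii) $\int_{S^{n-1}} g\,d\sigma_{n-1} > 0$ if $g(s) \ge 0$ for all $s \in S^{n-1}$ and $g(s_0) > 0$ for some $s_0 \in S^{n-1}$,

(iii) $\left|\int_{S^{n-1}} g\,d\sigma_{n-1}\right| \le \int_{S^{n-1}} |g(s)|\,d\sigma_{n-1}(s) \le \|g\|_{S^{n-1}} A_{n-1}$, and

(iv) $\int_{S^{n-1}} g(s)\,d\sigma_{n-1}(s) = \int_{S^{n-1}} g(sQ)\,d\sigma_{n-1}(s)$ (“rotational invariance”).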


The following assertion is a useful generalization of the 2-dimensional “polar
coordinate change of variable theorem”.


Theorem 1. Suppose $0 \le a < b$ and $f: B^n(a, b) \to \mathbb{R}$ is continuous. Then


\[ \int_{a \le |x| \le b} f(x)\,dx = \int_a^b r^{n-1}\left[\int_{S^{n-1}} f(rs)\,d\sigma_{n-1}(s)\right] dr. \tag{2} \]
Moreover, if


\[ \varphi(r) = \int_{a \le |x| \le r} f(x)\,dx \quad\text{for } a \le r \le b, \]


then $\varphi$ is continuously differentiable and
\[ \varphi'(r) = r^{n-1}\int_{S^{n-1}} f(rs)\,d\sigma_{n-1}(s) \quad\text{for } a \le r \le b. \]


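Proof: Assume for the moment that a > 0 and let $\epsilon > 0$. Choose $\delta > 0$ such
that $|f(x) - f(y)| < \epsilon$ whenever $x, y \in B^n(a, b)$ and $|x - y| < \delta$. Suppose that
$a \le r < r + h \le b$ and $h < \delta$. Then


\[ \left|\int_{r \le |x| \le r+h} f(x)\,dx - \int_{r \le |x| \le r+h} f(r|x|^{-1}x)\,dx\right| = \left|\int_{r \le |x| \le r+h} \{f(x) - f(r|x|^{-1}x)\}\,dx\right| \le \epsilon\bigl((r+h)^n - r^n\bigr)V_n, \]


since $|x - r|x|^{-1}x| = |x| - r \le h < \delta$ if $r \le |x| \le r + h$. But


\[
\begin{aligned}
\int_{r \le |x| \le r+h} f(r|x|^{-1}x)\,dx &= \int_{|x| \le r+h} f(r|x|^{-1}x)\,dx - \int_{|x| \le r} f(r|x|^{-1}x)\,dx\\
&= (r+h)^n\int_{B^n} f(r|y|^{-1}y)\,dy - r^n\int_{B^n} f(r|y|^{-1}y)\,dy\\
&= \frac{(r+h)^n - r^n}{n}\int_{S^{n-1}} f(rs)\,d\sigma_{n-1}(s).
\end{aligned}
\]


By the last two observations and the definition of $\varphi$, for $a \le r < r + h \le b$,


\[ \left|\varphi(r+h) - \varphi(r) - \frac{(r+h)^n - r^n}{n}\int_{S^{n-1}} f(rs)\,d\sigma_{n-1}(s)\right| \le \epsilon\bigl((r+h)^n - r^n\bigr)V_n \]
and hence
\[ \left|\frac{\varphi(r+h) - \varphi(r)}{h} - r^{n-1}\int_{S^{n-1}} f(rs)\,d\sigma_{n-1}(s)\right| \le \epsilon\,\frac{(r+h)^n - r^n}{h}\,V_n + \left|\frac{(r+h)^n - r^n}{nh} - r^{n-1}\right|\left|\int_{S^{n-1}} f(rs)\,d\sigma_{n-1}(s)\right|. \]


It follows that $\varphi$ is differentiable on [a, b] and
\[ \varphi'(r) = r^{n-1}\int_{S^{n-1}} f(rs)\,d\sigma_{n-1}(s) \quad\text{for } a \le r \le b. \]


The uniform continuity of f implies that $\varphi'$ is continuous. By the Fundamental
Theorem of Calculus,


\[ \int_{a \le |x| \le b} f(x)\,dx = \varphi(b) - \varphi(a) = \int_a^b \varphi'(r)\,dr = \int_a^b r^{n-1}\int_{S^{n-1}} f(rs)\,d\sigma_{n-1}(s)\,dr. \]
Let a tend to zero to complete the proof. ∎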
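In the plane, where the integral over $S^1$ is the arc length integral, formula (2) can be tried out numerically. The Python sketch below is an illustration under stated assumptions: it evaluates the right side of (2) for n = 2 by the midpoint rule in both variables, the function name and step counts are hypothetical, and the comparison value $15\pi/4$ for $f(x, y) = x^2$ on the annulus $1 \le |x| \le 2$ comes from $\int_1^2 r^3\,dr = 15/4$ and $\int_0^{2\pi}\cos^2\theta\,d\theta = \pi$.

```python
import math


def annulus_integral_polar(f, a, b, nr=200, ntheta=200):
    """Midpoint-rule value of int_a^b r * (int_0^{2 pi} f(r cos t, r sin t) dt) dr."""
    dr = (b - a) / nr
    dt = 2 * math.pi / ntheta
    total = 0.0
    for i in range(nr):
        r = a + (i + 0.5) * dr
        inner = 0.0
        for j in range(ntheta):
            t = (j + 0.5) * dt
            inner += f(r * math.cos(t), r * math.sin(t))
        total += r * inner * dt * dr   # the factor r is r^(n-1) with n = 2
    return total


approx = annulus_integral_polar(lambda x, y: x * x, 1.0, 2.0)
print(approx, 15 * math.pi / 4)
```

The two printed values should agree to several decimal places; the remaining gap is the discretization error of the midpoint rule in r.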


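Corollary 1. Suppose that $f: \mathbb{R}^n \to \mathbb{R}$, f is continuous, and f is homogeneous of degree
$p \ge 0$, i.e., $f(rx) = r^p f(x)$ whenever $x \in \mathbb{R}^n$ and r > 0. Then


\[ \int_{B^n} f(x)\,dx = \frac{1}{p+n}\int_{S^{n-1}} f(s)\,d\sigma_{n-1}(s). \]
Proof: By Theorem 1,


\[ \int_{B^n} f(x)\,dx = \int_0^1 r^{n-1}\left\{\int_{S^{n-1}} f(rs)\,d\sigma_{n-1}(s)\right\} dr = \left\{\int_0^1 r^{p+n-1}\,dr\right\}\int_{S^{n-1}} f(s)\,d\sigma_{n-1}(s) = \frac{1}{p+n}\int_{S^{n-1}} f(s)\,d\sigma_{n-1}(s). \] ∎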


Remarks. The author suspects that (1) is folkloric, and likely has appeared as a 
theorem rather than a definition. In exercise 6 on page 175 of [7], Rudin outlines 
measure-theoretic generalizations of these ideas. 

If for a continuous $g: S^{n-1} \to \mathbb{R}$ we define


\[ g^*(x) = \begin{cases} |x|\,g(|x|^{-1}x) & \text{if } 0 \ne x \in \mathbb{R}^n,\\ 0 & \text{if } x = 0 \in \mathbb{R}^n, \end{cases} \]


then $g^*$ is continuous on $\mathbb{R}^n$ and homogeneous of degree 1; hence, by the
corollary,


\[ \int_{S^{n-1}} g(s)\,d\sigma_{n-1}(s) = (n+1)\int_{B^n} g^*(x)\,dx. \]


This could have served as a definition of $\int_{S^{n-1}} g\,d\sigma_{n-1}$,
thereby allowing the definition of integration over $S^{n-1}$ to depend only on the
Riemann integrability of a continuous function on $B^n$ (our $g^*$ is continuous on $\mathbb{R}^n$
but $\tilde g$ is discontinuous at 0 unless g = 0).

Suppose $g: S^{n-1} \to \mathbb{R}$ is continuous and f is a continuous extension of g to $\mathbb{R}^n$.
For $0 \le r_1 < r_2$ it is natural to call


\[ \frac{1}{V_n(r_2^n - r_1^n)}\int_{r_1 \le |x| \le r_2} f(x)\,dx \]


the average value of f on $B^n(r_1, r_2)$; denote it by $\operatorname{av}(f, r_1, r_2)$. It is also natural to
call


\[ \frac{1}{A_{n-1}}\int_{S^{n-1}} g\,d\sigma_{n-1} \]
the average of g on $S^{n-1}$. It follows from Theorem 1 that


\[ \frac{1}{A_{n-1}}\int_{S^{n-1}} g\,d\sigma_{n-1} = \lim_{\substack{r_1 \le 1 \le r_2\\ r_2 - r_1 \to 0}} \operatorname{av}(f, r_1, r_2). \]
In fact the plausibility of this and of Theorem 1 supplied the intuition for our
definition.


3. AN INTEGRATION FORMULA INVOLVING THE GAMMA FUNCTION. It
will be crucial to integrate polynomials over $B^n$ and over $S^{n-1}$.
Given $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{Z}_+^n$ (such an $\alpha$ is called a multi-index and
$|\alpha| = \sum_{i=1}^n \alpha_i$ is its order), define
\[ p_\alpha(x) = x_1^{\alpha_1} \cdots x_n^{\alpha_n} \quad \text{for } x = (x_1, \dots, x_n) \in \mathbb{R}^n \]
(with the convention $0^0 = 1$). Such a $p_\alpha$ will be called a monomial. Every
polynomial (function) from $\mathbb{R}^n$ to $\mathbb{R}$ is a finite linear combination of monomials.

Suppose $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{Z}_+^n$. Since $p_\alpha$ is continuous and homogeneous of
degree $|\alpha|$, Corollary 1 ensures that
\[ \int_{S^{n-1}} p_\alpha(s)\, d\sigma_{n-1}(s) = (|\alpha| + n) \int_{B^n} p_\alpha(x)\,dx. \tag{3} \]
Moreover, by (ii) of Proposition 1, if $\beta = (\beta_1, \dots, \beta_n)$ is a permutation of
$(\alpha_1, \dots, \alpha_n)$ then
\[ \int_{B^n} p_\alpha(x)\,dx = \int_{B^n} p_\beta(x)\,dx, \]
and hence
\[ \int_{S^{n-1}} p_\alpha\, d\sigma_{n-1} = \int_{S^{n-1}} p_\beta\, d\sigma_{n-1}. \]


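If one of $\alpha_1, \dots, \alpha_n$ is odd we claim that
\[ \int_{B^n} p_\alpha(x)\,dx = 0 \quad \text{and} \quad \int_{S^{n-1}} p_\alpha(s)\, d\sigma_{n-1}(s) = 0. \]
For example, if $\alpha_1$ is odd then by (iii) of Proposition 1,
\[ \int_{B^n} p_\alpha(x)\,dx = \int_{B^n} (-x_1)^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}\, d(x_1, \dots, x_n) = -\int_{B^n} p_\alpha(x)\,dx, \]
so that
\[ \int_{B^n} p_\alpha(x)\,dx = 0 \]
and therefore, by (3),
\[ \int_{S^{n-1}} p_\alpha(s)\, d\sigma_{n-1}(s) = 0. \]

The task of integrating a polynomial over $B^n$ (or $S^{n-1}$) is therefore reduced to
that of finding
\[ \int_{B^n} p_\alpha(x)\,dx \]
in case $\alpha = (2\beta_1, \dots, 2\beta_n)$ with $\beta_1, \dots, \beta_n \in \mathbb{Z}_+$.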

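As it turns out, we can do a bit better with the help of Euler’s gamma function.
We will use the well-known fact from advanced calculus (see, e.g., [3], page 294 or
[5], page 484) that
\[ \int_0^1 t^\lambda (1-t)^\mu\,dt = \frac{\Gamma(\lambda+1)\Gamma(\mu+1)}{\Gamma(\lambda+\mu+2)} \quad \text{for } -1 < \lambda, \mu \in \mathbb{R}, \tag{4} \]
where $\Gamma$ is Euler’s gamma function. Recall that
\[ \Gamma(x+1) = x\Gamma(x) > 0 \quad \text{for } 0 < x \in \mathbb{R}, \text{ and} \tag{5} \]
\[ \Gamma(k) = (k-1)! \quad \text{for } 0 < k \in \mathbb{Z}. \tag{6} \]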


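For $\gamma_1, \dots, \gamma_n \in \mathbb{R}_+$, define
\[ I_n(\gamma_1, \dots, \gamma_n) = \int_{B^n} (x_1^2)^{\gamma_1} \cdots (x_n^2)^{\gamma_n}\, d(x_1, \dots, x_n); \]
by Corollary 1,
\[ I_n(\gamma_1, \dots, \gamma_n) = (2\gamma_1 + \cdots + 2\gamma_n + n)^{-1} \int_{S^{n-1}} (s_1^2)^{\gamma_1} \cdots (s_n^2)^{\gamma_n}\, d\sigma_{n-1}(s_1, \dots, s_n). \]
If $\lambda, \mu \in \mathbb{R}_+$, then
\begin{align*}
I_2(\lambda, \mu) &= \int_{B^2} (x^2)^\lambda (y^2)^\mu\, d(x, y) = 4 \int_0^1 \left[ \int_0^{\sqrt{1-x^2}} x^{2\lambda} y^{2\mu}\,dy \right] dx \\
&= 4 \int_0^1 x^{2\lambda}\, \frac{(1-x^2)^{(2\mu+1)/2}}{2\mu+1}\,dx = \frac{4}{2\mu+1} \int_0^1 t^\lambda (1-t)^{\mu+1/2}\, \frac{dt}{2\sqrt{t}} \\
&= \frac{2}{2\mu+1} \int_0^1 t^{\lambda-1/2} (1-t)^{\mu+1/2}\,dt \\
&= \frac{1}{\mu+1/2} \cdot \frac{\Gamma(\lambda+1/2)\,\Gamma(\mu+3/2)}{\Gamma(\lambda+\mu+2)} \quad \text{(by (4))} \\
&= \frac{\Gamma(\lambda+1/2)\,\Gamma(\mu+1/2)}{\Gamma(\lambda+\mu+2)} \quad \text{(by (5))}.
\end{align*}
Thus we have
\[ I_2(\gamma_1, \gamma_2) = \frac{\Gamma(\gamma_1+1/2)\,\Gamma(\gamma_2+1/2)}{\Gamma(\gamma_1+\gamma_2+2)} \quad \text{for } \gamma_1, \gamma_2 \in \mathbb{R}_+. \tag{7} \]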


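Now suppose that $n \ge 3$ and $\gamma_1, \dots, \gamma_n \in \mathbb{R}_+$. By integrating first with respect
to $x_n$, using Theorem 1 (with $n$ replaced by $n - 1$), and changing variables we
find that
\begin{align*}
I_n(\gamma_1, \dots, \gamma_n)
&= 2 \int_{B^{n-1}} (x_1^2)^{\gamma_1} \cdots (x_{n-1}^2)^{\gamma_{n-1}}\, \frac{(1 - x_1^2 - \cdots - x_{n-1}^2)^{(2\gamma_n+1)/2}}{2\gamma_n+1}\, d(x_1, \dots, x_{n-1}) \\
&= \frac{2}{2\gamma_n+1} \int_0^1 r^{n-2} \int_{S^{n-2}} r^{2\gamma_1 + \cdots + 2\gamma_{n-1}} (s_1^2)^{\gamma_1} \cdots (s_{n-1}^2)^{\gamma_{n-1}} (1-r^2)^{\gamma_n+1/2}\, d\sigma_{n-2}(s)\, dr \\
&= \frac{2}{2\gamma_n+1} \left[ \int_0^1 r^{2(\gamma_1 + \cdots + \gamma_{n-1} - 1 + n/2)} (1-r^2)^{\gamma_n+1/2}\,dr \right] \int_{S^{n-2}} (s_1^2)^{\gamma_1} \cdots (s_{n-1}^2)^{\gamma_{n-1}}\, d\sigma_{n-2}(s) \\
&= \frac{1}{2\gamma_n+1} \left[ \int_0^1 t^{\gamma_1 + \cdots + \gamma_{n-1} - 3/2 + n/2} (1-t)^{\gamma_n+1/2}\,dt \right] \int_{S^{n-2}} (s_1^2)^{\gamma_1} \cdots (s_{n-1}^2)^{\gamma_{n-1}}\, d\sigma_{n-2}(s).
\end{align*}
But, according to Corollary 1,
\[ \int_{S^{n-2}} (s_1^2)^{\gamma_1} \cdots (s_{n-1}^2)^{\gamma_{n-1}}\, d\sigma_{n-2}(s) = (2\gamma_1 + \cdots + 2\gamma_{n-1} + n - 1)\, I_{n-1}(\gamma_1, \dots, \gamma_{n-1}) \]
and, by (4),
\[ \int_0^1 t^{\gamma_1 + \cdots + \gamma_{n-1} - 3/2 + n/2} (1-t)^{\gamma_n+1/2}\,dt = \frac{\Gamma(\gamma_1 + \cdots + \gamma_{n-1} - 1/2 + n/2)\,\Gamma(\gamma_n + 3/2)}{\Gamma(\gamma_1 + \cdots + \gamma_n + 1 + n/2)}. \]
Thus,
\begin{align*}
I_n(\gamma_1, \dots, \gamma_n)
&= \frac{1}{2\gamma_n+1} \cdot \frac{\Gamma(\gamma_1 + \cdots + \gamma_{n-1} - 1/2 + n/2)\,\Gamma(\gamma_n + 3/2)}{\Gamma(\gamma_1 + \cdots + \gamma_n + 1 + n/2)} \cdot (2\gamma_1 + \cdots + 2\gamma_{n-1} + n - 1)\, I_{n-1}(\gamma_1, \dots, \gamma_{n-1}) \\
&= \frac{(\gamma_1 + \cdots + \gamma_{n-1} - 1/2 + n/2)\,\Gamma(\gamma_1 + \cdots + \gamma_{n-1} - 1/2 + n/2)}{\Gamma(\gamma_1 + \cdots + \gamma_n + 1 + n/2)} \cdot \frac{\Gamma(\gamma_n + 3/2)}{\gamma_n + 1/2} \cdot I_{n-1}(\gamma_1, \dots, \gamma_{n-1}) \\
&= \frac{\Gamma(\gamma_1 + \cdots + \gamma_{n-1} + 1/2 + n/2)\,\Gamma(\gamma_n + 1/2)}{\Gamma(\gamma_1 + \cdots + \gamma_n + 1 + n/2)} \cdot I_{n-1}(\gamma_1, \dots, \gamma_{n-1}),
\end{align*}
where the punch-line needed (5).
It follows by induction, with the aid of (4) and (7), that
\[ I_n(\gamma_1, \dots, \gamma_n) = \frac{\Gamma(\gamma_1 + 1/2) \cdots \Gamma(\gamma_n + 1/2)}{\Gamma(\gamma_1 + \cdots + \gamma_n + 1 + n/2)} \quad \text{for } n \ge 2 \text{ and } \gamma_1, \dots, \gamma_n \in \mathbb{R}_+. \tag{8} \]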
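Formula (8) is easy to sanity-check numerically. The following is a minimal sketch, not part of the original argument: the function names `ball_monomial_integral` and `polar_quadrature` are hypothetical, and the cross-check is restricted to $n = 2$, where the integral over the unit disk can be estimated independently by a midpoint rule in polar coordinates.

```python
import math

def ball_monomial_integral(gammas):
    """Right-hand side of (8): the integral over the unit ball B^n of
    (x_1^2)^g_1 * ... * (x_n^2)^g_n, for n >= 2 and all g_i >= 0."""
    n = len(gammas)
    numerator = math.prod(math.gamma(g + 0.5) for g in gammas)
    return numerator / math.gamma(sum(gammas) + 1 + n / 2)

def polar_quadrature(g1, g2, steps=400):
    """Midpoint-rule estimate of the same integral over the unit disk (n = 2),
    computed in polar coordinates as an independent cross-check."""
    dr, dt = 1.0 / steps, 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        for j in range(steps):
            t = (j + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            total += (x * x) ** g1 * (y * y) ** g2 * r * dr * dt
    return total

# With all exponents zero, (8) returns the volume of the unit ball.
print(ball_monomial_integral([0, 0]), math.pi)
print(ball_monomial_integral([0, 0, 0]), 4 * math.pi / 3)
# Closed form versus quadrature for two choices of exponents.
print(ball_monomial_integral([1, 0]), polar_quadrature(1, 0))
print(ball_monomial_integral([0.5, 1.5]), polar_quadrature(0.5, 1.5))
```

Each printed pair should agree: exactly (up to rounding) for the two volume checks, and to within the quadrature error for the last two lines.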


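4. THE DIVERGENCE THEOREM FOR $B^n$; A PROOF FOR POLYNOMIALS.
A function $f: B^n \to \mathbb{R}$ is said to be $C^1$ (continuously differentiable) on $B^n$
provided there is a $\lambda > 0$ and a continuously differentiable function
$\psi: \{x \in \mathbb{R}^n : |x| < 1 + \lambda\} \to \mathbb{R}$ such that $f(x) = \psi(x)$ for all $x \in B^n$; in this case
we define $\partial_i f(x) := \partial_i \psi(x)$ for $x \in B^n$ and $1 \le i \le n$ (where $\partial_i = \partial/\partial x_i$). Given
such $\lambda$ and $\psi$, choose a continuously differentiable function $\varphi: [0, +\infty) \to \mathbb{R}$ such
that
\[ \varphi(t) = 1 \text{ when } 0 \le t \le 1 + \lambda/3 \quad \text{and} \quad \varphi(t) = 0 \text{ when } 1 + \frac{2\lambda}{3} \le t < \infty. \]
Then one may (without embarrassment) define $f_0: \mathbb{R}^n \to \mathbb{R}$ by decreeing that
\[ f_0(x) = \begin{cases} \varphi(|x|)\,\psi(x) & \text{if } |x| < 1 + \lambda, \\ 0 & \text{if } |x| \ge 1 + (2/3)\lambda, \end{cases} \]
and conclude that $f_0$ is $C^1$ on $\mathbb{R}^n$ and $f_0(x) = f(x)$ for $x \in B^n$. In summary, a $C^1$
function from $B^n$ to $\mathbb{R}$ is the restriction to $B^n$ of a $C^1$ function from $\mathbb{R}^n$ to $\mathbb{R}$ (with
compact support if desired).

Suppose that $F: B^n \to \mathbb{R}^n$, say
\[ F(x) = (f_1(x), \dots, f_n(x)) \quad \text{for } x \in B^n. \]
We say that $F$ is $C^1$ on $B^n$ provided every $f_i$ is $C^1$ on $B^n$. In this case define
\[ \operatorname{div} F(x) = \sum_{i=1}^n \partial_i f_i(x) \quad \text{for } x \in B^n; \]
the function $\operatorname{div} F$ is called the divergence of $F$.
The divergence theorem for $B^n$ asserts that if $F: B^n \to \mathbb{R}^n$ is $C^1$ then
\[ \int_{B^n} \operatorname{div} F(x)\,dx = \int_{S^{n-1}} F(s) \cdot s\, d\sigma_{n-1}(s). \]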


With the notation of the preceding paragraph, this is equivalent to asserting that
\[ \sum_{i=1}^n \int_{B^n} \partial_i f_i(x)\,dx = \sum_{i=1}^n \int_{S^{n-1}} f_i(s_1, \dots, s_n)\, s_i\, d\sigma_{n-1}(s_1, \dots, s_n). \]
It therefore suffices to prove that, for any $C^1$ function $f: B^n \to \mathbb{R}$,
\[ \int_{B^n} \partial_i f(x)\,dx = \int_{S^{n-1}} f(s_1, \dots, s_n)\, s_i\, d\sigma_{n-1}(s_1, \dots, s_n), \quad 1 \le i \le n. \]
The rotation invariance of integration (over $B^n$ and over $S^{n-1}$) further implies that
one $i$ is as good as another, that is, the divergence theorem for $B^n$ is equivalent to

Proposition 3. For every $C^1$ function $f: \mathbb{R}^n \to \mathbb{R}$,
\[ \int_{B^n} \partial_n f(x)\,dx = \int_{S^{n-1}} f(s_1, \dots, s_n)\, s_n\, d\sigma_{n-1}(s_1, \dots, s_n). \tag{9} \]


Our strategy of proof is to apply the Weierstrass approximation theorem to
reduce consideration to the case in which $f$ is a polynomial function. The
polynomial case easily reduces to the monomial case by linearity (of $\partial_n$, the dot
product, and integration over both $B^n$ and $S^{n-1}$).

Suppose that $(\alpha_1, \dots, \alpha_n) \in \mathbb{Z}_+^n$, and $f(x) = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ for all
$x = (x_1, \dots, x_n) \in \mathbb{R}^n$.


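Case 1. $\alpha_n = 0$: In this case, $\partial_n f = 0$ and, by the “oddness” observation of §3,
\[ \int_{S^{n-1}} f(s_1, \dots, s_n)\, s_n\, d\sigma_{n-1}(s_1, \dots, s_n) = \int_{S^{n-1}} s_1^{\alpha_1} \cdots s_{n-1}^{\alpha_{n-1}} s_n\, d\sigma_{n-1}(s_1, \dots, s_n) = 0. \]

Case 2. One of $\alpha_1, \dots, \alpha_{n-1}$ is odd, or, $\alpha_n > 0$ and $\alpha_n$ is even: In this case,
\[ \int_{B^n} \partial_n f(x)\,dx = \alpha_n \int_{B^n} x_1^{\alpha_1} \cdots x_{n-1}^{\alpha_{n-1}} x_n^{\alpha_n - 1}\, d(x_1, \dots, x_n) = 0 \]
because one of the exponents is odd (see §3), and, for the same reason,
\[ \int_{S^{n-1}} f(s_1, \dots, s_n)\, s_n\, d\sigma_{n-1}(s_1, \dots, s_n) = \int_{S^{n-1}} s_1^{\alpha_1} \cdots s_{n-1}^{\alpha_{n-1}} s_n^{\alpha_n + 1}\, d\sigma_{n-1}(s_1, \dots, s_n) = 0. \]

Case 3. $\alpha_1, \dots, \alpha_{n-1}$ are even and $\alpha_n$ is odd: Choose $\beta_1, \dots, \beta_n \in \mathbb{Z}_+$ such that
$\alpha_k = 2\beta_k$ for $1 \le k \le n - 1$ and $\alpha_n = 2\beta_n + 1$. Then
\[ \int_{B^n} \partial_n f(x)\,dx = (2\beta_n + 1) \int_{B^n} x_1^{2\beta_1} \cdots x_n^{2\beta_n}\, d(x_1, \dots, x_n)
 = (2\beta_n + 1)\, \frac{\Gamma(\beta_1 + 1/2) \cdots \Gamma(\beta_n + 1/2)}{\Gamma(\beta_1 + \cdots + \beta_n + 1 + n/2)} \]
according to (3) and (8). By (3), (8), and (5),
\begin{align*}
\int_{S^{n-1}} f(s_1, \dots, s_n)\, s_n\, d\sigma_{n-1}(s_1, \dots, s_n)
&= \int_{S^{n-1}} s_1^{2\beta_1} \cdots s_{n-1}^{2\beta_{n-1}} s_n^{2\beta_n + 2}\, d\sigma_{n-1}(s_1, \dots, s_n) \\
&= (2\beta_1 + \cdots + 2\beta_{n-1} + 2\beta_n + 2 + n)\, I_n(\beta_1, \dots, \beta_{n-1}, \beta_n + 1) \\
&= 2(\beta_1 + \cdots + \beta_n + 1 + n/2)\, \frac{\Gamma(\beta_1 + 1/2) \cdots \Gamma(\beta_{n-1} + 1/2)\,\Gamma(\beta_n + 3/2)}{\Gamma(\beta_1 + \cdots + \beta_n + 2 + n/2)} \\
&= (2\beta_n + 1)\, \frac{\Gamma(\beta_1 + 1/2) \cdots \Gamma(\beta_n + 1/2)}{\Gamma(\beta_1 + \cdots + \beta_n + 1 + n/2)} \\
&= \int_{B^n} \partial_n f(x)\,dx.
\end{align*}
We have verified Proposition 3 in case $f$ is a polynomial function.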
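The three cases can be spot-checked numerically in the plane. This is a rough sketch rather than the authors' method: it assumes $n = 2$, takes $\sigma_1$ on the unit circle to be arc length (which is what the ball-based definition of the sphere integral appears to give for $n = 2$), and uses hypothetical helper names throughout.

```python
import math

def disk_integral(func, steps=400):
    """Midpoint-rule estimate of the integral of func(x, y) over the unit disk."""
    dr, dt = 1.0 / steps, 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        for j in range(steps):
            t = (j + 0.5) * dt
            total += func(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

def circle_integral(func, steps=4000):
    """Midpoint-rule estimate of the integral of func(s1, s2) over the unit
    circle with respect to arc length."""
    dt = 2 * math.pi / steps
    return sum(func(math.cos((j + 0.5) * dt), math.sin((j + 0.5) * dt)) * dt
               for j in range(steps))

def check_monomial(a1, a2):
    """Return both sides of (9) for f(x1, x2) = x1**a1 * x2**a2 with n = 2."""
    def d2f(x, y):  # partial derivative of f with respect to x2
        return 0.0 if a2 == 0 else a2 * x ** a1 * y ** (a2 - 1)
    lhs = disk_integral(d2f)
    rhs = circle_integral(lambda s1, s2: s1 ** a1 * s2 ** a2 * s2)
    return lhs, rhs

# Exponent pairs covering Case 1, Case 2, and Case 3.
for exponents in [(0, 0), (1, 2), (2, 2), (2, 3), (4, 1)]:
    print(exponents, check_monomial(*exponents))
```

The two numbers printed for each exponent pair should agree up to quadrature error; only the Case 3 pairs, here (2, 3) and (4, 1), give a nonzero common value.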


5. COMPLETING THE PROOF WITH THE HELP OF WEIERSTRASS. 
We will use the following variant of the 


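Weierstrass Approximation Theorem. If $K$ is a nonempty compact subset of $\mathbb{R}^n$,
$n \ge 1$, $f: K \to \mathbb{R}$, $f$ is continuous, and $\varepsilon > 0$, then there exists a polynomial
$p: \mathbb{R}^n \to \mathbb{R}$ such that
\[ |f(x) - p(x)| < \varepsilon \quad \text{for all } x \in K. \]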


This theorem can be proved in many ways. The Stone-Weierstrass Theorem 
(see, e.g., [8], page 210) yields it almost immediately and the celebrated method of 
Bernstein can be adapted to higher dimensions (see, e.g., [2], page 122). In fact 
Weierstrass’ proof (see Chapter 59 of [6]) can also be modified, in a natural way, to 
apply when n > 1. 


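Completion of the Proof of Proposition 3. Suppose that $f: B^n \to \mathbb{R}$, $f$ is $C^1$ and
$\varepsilon > 0$. Assume, without loss of generality, that $f$ is in fact defined and $C^1$ on $\mathbb{R}^n$.
For $x = (x_1, \dots, x_n) \in \mathbb{R}^n$,
\begin{align*}
f(x) &= f(0) + (f(x_1, \dots, x_{n-1}, 0) - f(0)) + (f(x_1, \dots, x_n) - f(x_1, \dots, x_{n-1}, 0)) \\
&= f(0) + (f(x_1, \dots, x_{n-1}, 0) - f(0)) + \int_0^{x_n} \partial_n f(x_1, \dots, x_{n-1}, t)\,dt. \tag{10}
\end{align*}
Choose polynomials $p_0: \mathbb{R}^{n-1} \to \mathbb{R}$ and $p_1: \mathbb{R}^n \to \mathbb{R}$ such that
\[ |f(x_1, \dots, x_{n-1}, 0) - f(0) - p_0(x_1, \dots, x_{n-1})| < \varepsilon/2 \tag{11} \]
and
\[ |\partial_n f(x_1, \dots, x_n) - p_1(x_1, \dots, x_n)| < \varepsilon/2 \quad \text{whenever } (x_1, \dots, x_n) \in B^n. \tag{12} \]
Now, for $x_1, \dots, x_n \in \mathbb{R}$, define
\[ p(x_1, \dots, x_n) = f(0) + p_0(x_1, \dots, x_{n-1}) + \int_0^{x_n} p_1(x_1, \dots, x_{n-1}, t)\,dt \tag{13} \]
and note that $p$ is a polynomial. By (10)-(13), for all $x = (x_1, \dots, x_n) \in B^n$,
\[ |f(x) - p(x)| < \varepsilon/2 + \left| \int_0^{x_n} \bigl( \partial_n f(x_1, \dots, x_{n-1}, t) - p_1(x_1, \dots, x_{n-1}, t) \bigr)\,dt \right| \le \varepsilon/2 + |x_n|\,\varepsilon/2 \]
so that
\[ |f(x) - p(x)| < \varepsilon. \tag{14} \]
By (13), $\partial_n p = p_1$; hence, by (12),
\[ |\partial_n f(x) - \partial_n p(x)| < \varepsilon/2 \quad \text{for all } x \in B^n. \tag{15} \]
From §4,
\[ \int_{B^n} \partial_n p(x)\,dx = \int_{S^{n-1}} p(s_1, \dots, s_n)\, s_n\, d\sigma_{n-1}(s_1, \dots, s_n). \]
Thus, by (14) and (15),
\begin{align*}
\left| \int_{B^n} \partial_n f(x)\,dx - \int_{S^{n-1}} f(s_1, \dots, s_n)\, s_n\, d\sigma_{n-1}(s_1, \dots, s_n) \right|
&\le \left| \int_{B^n} \partial_n f(x)\,dx - \int_{B^n} \partial_n p(x)\,dx \right| \\
&\quad + \left| \int_{S^{n-1}} p(s_1, \dots, s_n)\, s_n\, d\sigma_{n-1}(s_1, \dots, s_n) - \int_{S^{n-1}} f(s_1, \dots, s_n)\, s_n\, d\sigma_{n-1}(s_1, \dots, s_n) \right| \\
&\le (\varepsilon/2) V_n + \varepsilon A_{n-1}.
\end{align*}
Since $\varepsilon$ was chosen arbitrarily, the proof of Proposition 3 is complete. ∎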
The divergence theorem for B” has therefore been established. 


6. FURTHER REMARKS. 


1. The divergence theorem can be extended to arbitrary balls in $\mathbb{R}^n$ by
appropriately defining integration over arbitrary spheres. This can be done
as follows.

Given $p \in \mathbb{R}^n$ and $\rho > 0$, let $B(p, \rho) = \{x \in \mathbb{R}^n : |x - p| \le \rho\}$ and
$S(p, \rho) = \{x \in \mathbb{R}^n : |x - p| = \rho\}$. For a continuous $g: S(p, \rho) \to \mathbb{R}$ define
\[ \int_{S(p, \rho)} g\, d\sigma_{n-1} = \rho^{n-1} \int_{S^{n-1}} g(p + \rho s)\, d\sigma_{n-1}(s) = n \rho^{n-1} \int_{B^n} g(p + \rho |x|^{-1} x)\,dx. \]

2. By (8),
\[ V_n = \frac{\Gamma(1/2)^n}{\Gamma(1 + n/2)} = \frac{2}{n} \cdot \frac{\Gamma(1/2)^n}{\Gamma(n/2)} \quad \text{for } n \ge 2. \]
In particular, $\pi = V_2 = \Gamma(1/2)^2$; i.e.,
\[ \Gamma(1/2) = \sqrt{\pi}. \]
Hence
\[ V_n = \frac{2}{n} \cdot \frac{\pi^{n/2}}{\Gamma(n/2)} \quad \text{for } n \ge 2. \]

3. Riemann integration and Jordan content can be defined in the $S^{n-1}$ setting
as follows.
Given $A \subset S^{n-1}$, call $A$ contented provided $\hat{A} = \{rs : s \in A,\ 0 \le r \le 1\}$ has
Jordan content and in this case define $\sigma_{n-1}(A) = n \mu_n(\hat{A})$, where $\mu_n$ denotes
Jordan content in $\mathbb{R}^n$. If $g: S^{n-1} \to \mathbb{R}$, let us say that $g$ is (Riemann)
integrable on $A$ provided $x \mapsto g(|x|^{-1}x)$ is Riemann integrable on $\hat{A}$, in which case
decree that
\[ \int_A g\, d\sigma_{n-1} = n \int_{\hat{A}} g(|x|^{-1}x)\,dx. \]


4. More ambitiously, one can define
\[ \int_{S^{n-1}} g\, d\sigma_{n-1} = n \int_{B^n} g(|x|^{-1}x)\,dx \]
for any Borel measurable $g: S^{n-1} \to \mathbb{R}_+$ (this is the gist of exercise 6,
Chapter 8 of [7]) and thereby construct a rotation-invariant regular Borel
measure on $S^{n-1}$. This can also be accomplished by applying the Riesz
Representation Theorem (see [8], page 352) to extend our humble integral.
In fact, a theorem of Banach (see [9], pp. 314-319, or [8], pp. 361-370)
implies that there is a unique regular Borel probability measure on $S^{n-1}$
that is rotation invariant.


Math Lingo vs. Plain English:
Double Entendre 


Reuben Hersh 


Once upon a time, when I was a teaching assistant, teaching a class of the kind 
mockingly called “Math for Poets,” an obnoxious freshman said to me, “Zero isn’t 
a number.” 

I have forgotten my answer, but I remember finding her remark a shocking 
expression of profound ignorance. 

Years later, it dawned on me—she was right! 

If I say “I own a number of calculus books” or “I have a number of friends at 
the Courant Institute,” I don’t mean zero books or zero friends. I don’t even mean 
one book or one friend. I mean two or more. That’s what “number” means in plain 
English. I read recently that the famous phenomenologist Edmund Husserl meant
by “number” something greater or equal to 2. So did Plato. 

In mathematical talk, “number” has several meanings. None is the plain English 
meaning. The ordinary math teacher, like me back then, is so deeply embedded in 
math lingo that he/she doesn’t notice the inconsistency. But the inconsistency can
confuse students. 

I say “math lingo,” not language. It’s a jargon, a semidialect of English (or some 
other natural language), not a complete language. You can’t say “I have a 
headache” or “You bore me” in math lingo. 

In math lingo, a straight line is the simplest example of a curve. In plain 
English, quite otherwise: a straight line isn’t a curve, and a curve isn’t a straight 
line. 

In English, what we call a “line segment” is just a “line.” What we call a “line” is 
“an infinite line.” “Difference,” “product,” “factor,” “prime” all have different 
meanings in plain English and in math lingo. I may ask a student, “If you subtract
zero from zero, what’s the difference?” While answering math-linguistically, “zero,”
she may be thinking, plain-Englishly, “That’s right! Who cares? What’s the
difference?”

In English, “adding” increases what you’ve got. In math lingo, it may increase it, 
decrease it or neither, depending on whether you happen to be adding something 
positive, negative or zero. 

Correspondingly, subtracting decreases. In math lingo, it may decrease or 
increase or neither. 

In English, “adding” and “subtracting” are opposite. In math lingo, they’re 
opposite, and yet they’re the same! For adding a number is the same as subtracting 
some other number (its negative). 

In English, “multiplying” means repeated adding. It makes things bigger. In 
math lingo, multiplying makes them bigger, smaller, or neither, depending on what 
you multiply with. 

Correspondingly, “divide” means cut into pieces, possibly equal pieces. In math
lingo, “divide” is the same as “multiply,” in the sense that dividing by a number
other than zero is the same as multiplying by some other number (its reciprocal).

There’s a familiar conundrum about amoebas: amoebas multiply by dividing. To
untangle this nonsensical but correct statement, you must see the difference
between the mathematical and the plain English meanings of “multiply” and
“divide.”

What should you do about all this? Be aware of it and point it out to students. 
By appropriate examples, make them realize that what they hear in class or read in 
the text is technical jargon, not plain English. Otherwise, when they try to 
remember what you said in yesterday’s lecture, they may remember it with the 
wrong meaning (the plain English). 

Anneli Lax reminded me of one of the commonest linguistic pitfalls: the little
one-letter word “a.” Her example is “Show that a number divisible by 6 is even.”

No seasoned math teacher is surprised to receive the wrong answer, “42 is 
divisible by 6. 42 is even.” Why is this answer wrong? 42 is divisible by 6, and 42 is 
even. What’s wrong is that the question has been misunderstood. By “a,” the 
questioner meant “every”; the student misinterpreted it as “some.” This is a 
quantification problem, which in principle could be cured by using symbolic logic 
instead of English. But in a case like this, something deeper is wrong. The student 
should realize that with the interpretation “some,” the question is too trivial to be 
on the test. Grounding in the context saves the student from most verbal pitfalls. 
One goal of teaching is to ground the student in the context. Linguistic ambiguities 
can hurt. 

In logic, the pitfalls of “or” and “implies” are familiar. 

Take “or.” In plain English, “Tea or coffee?” means one or the other, not both. 
It’s called the “exclusive or.” 

“Are you coming or going?” 

“Was that your husband or your boy friend?” 

“Do it now or later?” 

All are exclusive. It’s hard to think of a colloquial example of the other “or,” the 
inclusive one. A reasonable example might be, “Like a hug or a kiss?” 

In logic, “or” is inclusive by convention. “A or B” is true if A or B or both is the
case. I think it’s customary to explain on the first day of elementary logic class that 
logicians have decreed “or” to be inclusive. A student can accept that logicians felt 
they had to pick one or the other. Perhaps they had a reason for picking the 
inclusive. 

Peter Lax tells about the famous logician Abraham Fraenkel, of German origin 
and Israeli residence. Once in Jerusalem or Tel Aviv he was on a bus scheduled to 
leave the station at 9 A.M. At 9:05 the bus was still sitting in the station. Fraenkel
waved a bus schedule at the bus driver, who asked, “What are you, a German or a
professor?” Fraenkel inquired in return, “Do you use the inclusive ‘or’ or the 
exclusive?” 

“Implies” is worse. In plain English, “A implies B” means that if A is true, B 
must be true. If A is false, the “implies” statement is vacuous, neither true nor 
false. 

But in logic, the “law of the excluded middle” insists that every statement be 
either true or false. The statement “A implies B” has to be either true or false, 
even if A is false. Logicians chose “true.” So in logic, if A is false, then A implies 
B, whatever B may be. This is so unintuitive, I say logicians should have used 
another word, even made up a word. It’s too late for that. But the student is told 
that “implies” in logic is different from “implies” in plain English. In pre-calculus, 
calculus, and post-calculus, we should be equally considerate to warn of linguistic 
traps.

I have just carelessly used “equally.” “Equal” is used freely, from kindergarten
to postgraduate. It’s never defined or explained.

In plain English, its meaning varies. Sometimes it’s “identical, indistinguishable.”
Sometimes it’s “worth the same number of dollars.” Or “just as good” for some 
purpose. 

Math lingo sometimes says “equal,” sometimes “equivalent,” the latter if an 
equivalence relation has been defined. Then we explain that an equivalence 
relation is Reflexive, Symmetric, and Transitive; it defines a partition on a set. 

But what does equal mean? When we say 1/2 = 2/4, we don’t mean 1/2 is 
indistinguishable from 2/4. They have different numerators. They have different 
denominators. We regard them as equivalent for good and sufficient reasons. All 
this may be explained in an advanced course, on the rare occasion when a detailed 
construction of the rationals is carried out. But already in the fourth grade the = 
relation is an equivalence relation between fractions, not an identity. No one ever 
explains this, so there’s no way for the student to understand = , except in terms 
of models like slices of apple pie. 

This nonunderstanding was manifested frighteningly when a calculus student
was asked, “What is the minimum of the function

$y = x^2 + 2x + 5$?”

and answered “correctly”

“$x^2 + 2x + 5 = 2x + 2 = -1 = 4$ minimum”


Maybe this is the outcome of years in high school spent factoring, multiplying, and 
dividing expressions that always remained equal. 

In plain English, set and group are synonyms. When we teach groups, we define 
set and group, then charge ahead. But some students wonder, “What’s the 
difference? A group is the same as a set.” Mention this plain English equivalence, 
and state explicitly that in math these words have different meanings. 

The same is true of sequence and series. Their plain English meanings are the 
same—what in math lingo we call “a finite list.” “Series” is more colloquial than 
sequence—for example, it’s the World Series, not the World Sequence! Here the 
danger of confusion is more serious than with set and group. The mathematical 
meanings of sequence and series are so close that the distinction between them is 
crucial. In teaching series, we should acknowledge that we’re giving a new meaning 
to a common word: putting + signs instead of commas between the terms. 

The first day of first-semester calculus I like to talk about driving to Santa Fe. 
Distance from Albuquerque is a function of time. Speed is another function of 
time. But what is “function” in English? If you ask, “Of what is the speed a 
function?,” you’re told, “It’s a function of how much gas you give” or “Of how hard 
you push the accelerator pedal.” “Function” in English (apart from the irrelevant 
reference to weddings and Bar Mitzvahs) involves causal dependence. “How fast 
you learn is a function of how hard you study,” for example. How can anything be a 
function of time? But the students swallow that. They understand a graph with a 
time axis. Then I say, “Distance is a monotonic increasing function of time, so the 
inverse function exists. Time is a function of distance.” How can time, the 
independent, uncaused variable, be caused by distance? We try to teach our 
technical meaning of “function” without noticing the meaning the student brings 
into class. 

We’re aware that “limit” and “converge” are deep concepts. We sweat over 
them. But we don’t acknowledge the complication caused by plain English. A
“limit” in English is a barrier, a boundary beyond which one may not pass. This
may partly explain why students want to approach a limit only from one side, not 
in alternating fashion. As for “converge.” In practical computation, an algorithm 
converges when it settles down to one value and stays there—stays till whoever’s 
doing the calculation is satisfied. That’s the English of converge—“settle down” 
“close to” some “limit.” In teaching our uncomputational, abstract meaning of 
“converge,” we should talk about the colloquial meaning and explain the
difference.

In advanced mathematics, there’s more linguistic confusion. Surds (absurd), 
irrational and imaginary numbers, singular perturbations, degenerate kernels, 
strange attractors—all sound dangerous, undesirable, things to avoid. Yet a 
degenerate kernel or a singular perturbation may be more useful than a
nondegenerate or regular one.

We also talk about “function spaces.” The points in a function space are 
functions. But a function is a graph—a curve. How can a curve be a point? A 
point, which has no parts! We don’t acknowledge the change of meaning. Just give 
a definition and two examples, then charge ahead. 

An example of the opposite kind (due to Peter Lax) is “simple curve.” Draw a 
confusing tangle that doesn’t intersect itself. It’s complicated. We say it’s simple.

What about “partial?” A partial order isn’t a special kind of order. A partial 
differential equation isn’t part of an ordinary differential equation. And an 
ordinary differential equation may well be extraordinary. 


Exercise: (a) give the plain English meaning of prime; differentiate; integrate. 

(b) check your answers against a standard dictionary. 

(c) make up three slogans, one using each of these three words, that could 
appear on picket signs at a demonstration. 

It’s fortunate that some double meanings are so far apart they can be used for a
joke. A manifold is part of an automobile engine (I think), and a commutator is 
part of a direct current electric motor. 


ACKNOWLEDGMENTS. Veronka John-Steiner, Anneli Lax, and Peter Lax gave suggestions and
encouragement. [1] is an inspiring example of frank talk about college math teaching.


NOTES


Edited by Jimmie D. Lawson 


Three Secrets About Harmonic Functions 


R. B. Burckel 


1. INTRODUCTION AND NOTATION. I’ll not be coy, but reveal these to the
reader at once. Actually, they’re not secrets to practitioners, but only to the 
broader mathematical public. Since they are striking results and accessible by 
elementary means, perhaps their revelation will be welcomed. They say roughly: 


1. There exists a simply connected plane region for which the Dirichlet 
problem is not solvable. 

2. The areal mean-value property of harmonic functions characterizes disks. 

3. Harmonicity of the function that measures the distance to the boundary of a 
region characterizes half-planes. 


In the following sections I will give the precise statements and proofs, but first
some necessary notation. Let $D(a, r) := \{z \in \mathbb{C} : |z - a| < r\}$ for $a \in \mathbb{C}$, $r > 0$ and
$D := D(0, 1)$. $H$ denotes the open right half-plane $\{z \in \mathbb{C} : \operatorname{Re} z > 0\}$. For $a, b \in \mathbb{C}$
let $[a, b]$ denote the interval $\{(1 - t)a + tb : 0 \le t \le 1\}$. $\mathbb{C}_\infty$ is the one-point
compactification of $\mathbb{C}$ (a.k.a. the Riemann sphere). A region is a non-void open,
connected subset $\Omega$ of $\mathbb{C}$. We write $\partial\Omega := \overline{\Omega} \setminus \Omega$, where bar denotes closure
in $\mathbb{C}$. If $\Omega$ is unbounded, we write $\partial_\infty\Omega := \{\infty\} \cup \partial\Omega$. Of the myriad equivalent
notions of simple connectivity in $\mathbb{C}$ we adopt this one: $\Omega$ simply connected means
that $\mathbb{C}_\infty \setminus \Omega$ is connected, or what is the same thing, $\mathbb{C} \setminus \Omega$ has no bounded
component.


2. A COUNTEREXAMPLE. We say that the Dirichlet problem is solvable in $\Omega$ if
every continuous $f: \partial\Omega \to \mathbb{R}$ admits a continuous extension $F: \overline{\Omega} \to \mathbb{R}$ such that $F$
is harmonic in $\Omega$. If $f$ is bounded, the Perron-Kellogg-Wiener construction
produces a harmonic function $F: \Omega \to \mathbb{R}$ that satisfies
\[ \lim_{\substack{z \to x \\ z \in \Omega}} F(z) = f(x) \]
for every $x \in \partial\Omega$ at which $\Omega$ possesses a barrier. Every simply connected region in
$\mathbb{C}$ has a barrier at every one of its boundary points. (For these facts see [4].) The
continuous function $f$ is certainly bounded if $\partial\Omega$ is bounded, or if $\partial\Omega$ is
unbounded and $f$ is actually continuous on the (compact) set $\partial_\infty\Omega$, i.e.,
\[ \lim_{\substack{z \to \infty \\ z \in \partial\Omega}} f(z) \text{ exists in } \mathbb{R}. \]
In view of all this, Theorem 1 might come as a shock. As far as I know, such
examples are only of recent origin; but see [3]. Specifically, this one originated in
[6]. After the author presented a somewhat simplified version to a seminar,
Sadahiro Saeki introduced further simplifications. Here is his version.


Theorem 1. There exists a simply connected plane region for which the Dirichlet 
problem is not solvable. 


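Proof: For each integer $n \ge 3$ define the open sector
\[ S_n := \left\{ re^{i\theta} : 0 < r < n,\ \left| \theta - \frac{\pi}{n} \right| < \frac{\pi}{2n(n+1)} \right\}. \]
Since the restriction on $\theta$ here is
\[ \frac{\pi}{n+1} \left( 1 + \frac{1}{2n} \right) < \theta < \frac{\pi}{n} \left( 1 + \frac{1}{2(n+1)} \right), \]
it is obvious that
\[ \overline{S_m} \cap \overline{S_n} = \{0\} \quad \text{whenever } m > n \ge 3. \tag{1} \]
Define
\[ \Omega := D(0, 2) \cup \bigcup_{n \ge 3} S_n. \]
This set is open and is starlike with respect to 0; hence it is connected. Hence
too $\mathbb{C} \setminus \Omega$ is a union of half-lines, so $\mathbb{C} \setminus \Omega$ has no bounded component, making $\Omega$
simply connected. Notice that $\partial\Omega$ consists of $[2, +\infty[$, together with arcs of
$\partial D(0, 2)$, together with the disjoint sets $(\partial S_n) \setminus D(0, 2)$ for $n \ge 3$. Now it is clear
that $z \mapsto (ze^{-i\pi/n})^{n(n+1)}$ maps $S_n$ into $H$ and the boundary rays of $S_n$ into $i\mathbb{R}$.
From this and (1) it is easy to see that a function $f: \partial\Omega \to \mathbb{R}$ is well defined by
\[ f(z) = \begin{cases} \operatorname{Re}\bigl[ (ze^{-i\pi/n})^{n(n+1)} \bigr] & \text{for } z \in (\partial S_n) \setminus D(0, 2),\ n \ge 3, \\ 0 & \text{for all other } z \in \partial\Omega, \end{cases} \]
and is continuous. Suppose $h$ is a continuous real-valued extension of $f$ to $\overline{\Omega}$ that
is harmonic in $\Omega$. On the compact subset $\overline{D(0, 2)} \subset \overline{\Omega}$, $h$ is bounded below, say by
$c \in\, ]-\infty, 0[$. For each integer $n \ge 3$
\[ h(z) - \operatorname{Re}\bigl[ (ze^{-i\pi/n})^{n(n+1)} \bigr] \ge c \quad \text{for all } z \in \partial S_n. \tag{2} \]
This is because for $z \in (\partial S_n) \setminus D(0, 2) \subset \partial\Omega$, the left-hand side of (2) is
$h(z) - f(z) = 0 \ge c$, while for $z \in (\partial S_n) \cap D(0, 2)$ it is $h(z) - 0 \ge c$, by definition of $c$.
By the Maximum Principle [4, p. 253], inequality (2) must hold as well at
$z = 2e^{i\pi/n} \in S_n$, yielding
\[ h(2e^{i\pi/n}) - 2^{n(n+1)} \ge c. \tag{3} \]
The validity of (3) for all $n \ge 3$ makes $h$ unbounded on the compact subset $\overline{D(0, 2)}$
of $\overline{\Omega}$. This contradiction shows that no such function $h$ exists. ∎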


3. A CHARACTERIZATION OF DISKS. The second secret is surprising because
it’s an analytic characterization of a geometric property. Recall the areal mean-value
property of harmonic functions:

(AM) $U$ open $\subset \mathbb{C}$, $\overline{D(w, R)} \subset U$, $h: U \to \mathbb{R}$ harmonic $\Rightarrow$
\[ h(w) = \frac{1}{\pi R^2} \int_{D(w, R)} h(z)\, d\lambda(z), \]
where $\lambda$ is two-dimensional Lebesgue (area) measure. Using Lebesgue’s Monotone
Convergence Theorem we can slightly extend the scope of (AM): its conclusion
holds whenever
\[ D(w, R) \subset U \quad \text{and} \quad \int_{D(w, R)} |h(z)|\, d\lambda(z) < +\infty. \]
In this form there is an unexpected converse:

Theorem 2. Suppose $U$ open $\subset \mathbb{C}$ has finite area, $z_0 \in U$, and
\[ (*) \qquad h(z_0) = \frac{1}{\lambda(U)} \int_U h(z)\, d\lambda(z) \]
for every $h$ that is harmonic in and (absolutely) integrable over $U$. Then $U$ is a disk
with center $z_0$.


Proof: Let $r$ be the radius of the largest open disk, call it $D$, centered at $z_0$, which
lies in $U$. This number is finite and positive and there exists (compactness) some
$z_1 \in \mathbb{C} \setminus U$ with $|z_1 - z_0| = r$. We will show that $D = U$. To this end, define the
function $h$ on $U$ by
\[ h(z) = \frac{|z - z_0|^2 - r^2}{|z - z_1|^2} + 1, \quad z \in U. \tag{4} \]
Since $|z_1 - z_0| = r$, a little computation reveals that
\[ h(z) = 2 \operatorname{Re}\left\{ \frac{z - z_0}{z - z_1} \right\}, \tag{5} \]
so $h$ is harmonic in $U$. To check that $h$ is integrable over $U$, (5) shows that it
suffices, since $U$ has finite area and $h$ is bounded outside any neighborhood of $z_1$,
to show that $|z - z_1|^{-1}$ is integrable over $D(z_1, 1)$. That follows from passage to
polar coordinates:
\[ \int_{D(z_1, 1)} |z - z_1|^{-1}\, d\lambda(z) = \int_{D(0, 1)} |z|^{-1}\, d\lambda(z) = \int_0^{2\pi} \int_0^1 \rho^{-1} \rho\, d\rho\, d\theta = 2\pi. \]
Now (*) ensures that
\[ 0 = h(z_0) = \int_U h\, d\lambda = \int_D h\, d\lambda + \int_{U \setminus D} h\, d\lambda. \tag{6} \]
But according to (AM) we also have
\[ 0 = h(z_0) = \int_D h\, d\lambda. \]
Combined with (6) and the fact $h \ge 1$ in $U \setminus D$ (see (4)), this gives
\[ 0 = \int_{U \setminus D} h\, d\lambda \ge \lambda(U \setminus D) = \lambda(U \setminus \overline{D}). \]
It follows that the open set $U \setminus \overline{D}$ must be empty, that is, $U \subset \overline{D}$. From $D \subset U$
open $\subset \overline{D}$ it follows that $U = D$. ∎
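The "little computation" behind (5) is short enough to spell out. The sketch below assumes that (4) reads $h(z) = (|z - z_0|^2 - r^2)/|z - z_1|^2 + 1$, and it introduces the abbreviations $a = z - z_1$ and $b = z_1 - z_0$ (so that $|b| = r$ and $z - z_0 = a + b$), which do not appear in the original.

```latex
\begin{align*}
h(z) &= \frac{|a+b|^2 - |b|^2 + |a|^2}{|a|^2}
      = \frac{2|a|^2 + 2\operatorname{Re}(a\bar{b})}{|a|^2}
      = \frac{2\operatorname{Re}\bigl(a\,\overline{(a+b)}\bigr)}{a\bar{a}} \\
     &= 2\operatorname{Re}\frac{\overline{a+b}}{\bar{a}}
      = 2\operatorname{Re}\frac{a+b}{a}
      = 2\operatorname{Re}\frac{z - z_0}{z - z_1}.
\end{align*}
```

Because $z_1 \notin U$, the quotient $(z - z_0)/(z - z_1) = 1 + (z_1 - z_0)/(z - z_1)$ is holomorphic in $U$, so its real part is harmonic there; the same identity shows why integrability of $h$ reduces to that of $|z - z_1|^{-1}$.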


This proof is due to Kuran [7], and represents the final touch of elegance and 
hypothesis-reduction in a long evolution. In fact, this result is part of a large 
subject called quadrature problems that interested readers can find more about in 
[9]. The hypothesis of Theorem 2 can be weakened to require only that (*) hold 
for all nonnegative harmonic functions / that are integrable over U. However, it is 
not sufficient that (*) hold only for bounded harmonic functions h; see [1]. 


4. ANALYTIC CHARACTERIZATION OF HALF-PLANES. The third secret is 
also an analytic characterization of a geometric property. Theorem 3 occurs as an 
exercise in the little book of Fuchs and Schumitsky [5], but in a letter to me 
Professor Fuchs indicated that he is unaware of its provenance. Armitage and 
Kuran [1] investigate similar problems and prove that the function $\rho$ in (7) is
subharmonic in $\Omega$ if and only if $\mathbb{C} \setminus \Omega$ is convex. Using their result, Parker [8] gave
a proof of Theorem 3 somewhat different from that presented here. The proof that
follows was helped to birth by conversations with Erich Novak and correspondence
with Wolfhard Hansen. For a region $\Omega$ that is not all of $\mathbb{C}$ let
\[ \rho(z) := \inf\{ |z - w| : w \in \mathbb{C} \setminus \Omega \}, \quad z \in \mathbb{C}, \tag{7} \]
be the function “distance to the complement of $\Omega$.” It is well known and
elementary that $\rho$ is a continuous function on $\mathbb{C}$, even a contractive mapping,
whose zero-set is $\mathbb{C} \setminus \Omega$. Each function $z \mapsto |z - w|$ is subharmonic in $\mathbb{C}$, although
this fact plays no role in the sequel.


Theorem 3. If $\rho$ is harmonic in $\Omega$, then $\Omega$ is a half-plane.


Of course, the converse of this is trivially also true. For the proof of the theorem 
we need an elementary but perhaps not so well known fact about harmonic 
functions. A direct proof is indicated, although it can also be deduced from 
better-known properties of holomorphic functions by symmetry arguments. 


Lemma. Suppose $a, b \in \mathbb{C}$, $[a, b] \subset U$ open $\subset \mathbb{C}$, $h: U \to \mathbb{R}$ harmonic. If $h^{-1}(0) \cap [a, b]$
is infinite, then $h[a, b] = 0$.


Proof: We may suppose $[a, b] \subset \mathbb{R}$. Let $x_0$ be a limit point of $h^{-1}(0) \cap [a, b]$.
For some $r > 0$, $D(x_0, r) \subset U$. Harmonic functions are real-analytic, so $h(x) =
h(x + i \cdot 0)$ is represented by a power series in $x - x_0$ for $x \in\, ]x_0 - r, x_0 + r[$. The
usual argument that the zeros of a non-trivial power series are isolated [4, p. 78]
can be applied here to show that $h = 0$ in this whole interval. This shows that the
closed set of all limit points of $h^{-1}(0) \cap [a, b]$ is also relatively open in $[a, b]$. As
this set is non-empty by hypothesis, it must be all of the connected set $[a, b]$. ∎


Proof of Theorem 3. By definition of p 
Q=p'(j0,+-[) and D(z,p(z))cQ  forallzeQ. (8) 


Fix x) © 0. By compactness there exists b) € 9M with p(x,) = |x) — by|. After 
translation and rotation of ( we can assume 


by = 0, Xp is real and positive. 
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Then an application of the triangle inequality shows that p(x) = x for all 
x Ele, xg! 


D(%),X%)) CQ — and p(x) =x Wx €]0, xo]. (9) 


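From the lemma it follows that $p(x) = x$ for all $x \in\, ]0, 2x_0[\, \subset D(x_0, x_0)$, whence
by continuity of $p$ on $\mathbb{C}$, $p(2x_0) = 2x_0 > 0$. It follows from (8) that


$$D(2x_0, 2x_0) \subset \Omega \quad\text{and}\quad p(x) = x \quad \forall x \in\, ]0, 2x_0]. \tag{10}$$

Iterating the argument that led from (9) to (10) shows that

$$H = \bigcup_{n \in \mathbb{N}} D(2^n x_0, 2^n x_0) \subset \Omega \quad\text{and}\quad p(x) = x \quad \forall x \in\, ]0, +\infty[. \tag{11}$$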
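Consequently, for $z \in H$, since $H \subset \Omega$,

$$p(z) = \operatorname{dist}\{z, \mathbb{C} \setminus \Omega\} \ge \operatorname{dist}\{z, \mathbb{C} \setminus H\} = \operatorname{Re} z. \tag{12}$$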


Now (11) and (12) say that $p(z) - \operatorname{Re} z$ is harmonic and non-negative in $H$, but
vanishes at (many) points of $H$, so by the Maximum Principle it vanishes identically
in $H$:


$$p(z) = \operatorname{Re} z \quad \forall z \in H \subset \Omega. \tag{13}$$


Since $p$ is continuous on $\mathbb{C}$, this shows that $p(i\mathbb{R}) = 0$, so by (8), $i\mathbb{R} \cap \Omega = \emptyset$.
Being connected, $\Omega$ cannot therefore meet both $H$ and $\mathbb{C} \setminus H$, so from (13) it
follows that


$$H = \Omega.$$


ACKNOWLEDGMENTS. I thank the referee for spotting some deficiencies in the first draft, and 
David Armitage for some of the history in Section 4.

A Short Proof of the Erdős-Mordell Theorem


Vilmos Komornik


Dedicated to the memory of P. Erdős (1913-1996)


The following beautiful inequality was conjectured by Erdős in [1]:


Theorem. Given a point P inside a triangle ABC, let us denote by $R_a$, $R_b$, $R_c$ its
distances from the vertices A, B, C and by $r_a$, $r_b$, $r_c$ its distances from the lines of the sides
a = BC, b = CA and c = AB; see Figure 1. Then

$$R_a + R_b + R_c \ge 2(r_a + r_b + r_c). \tag{1}$$

He arrived at his conjecture in an experimental way in 1932, after having drawn
many triangles.




Figure 1 


The first proofs by Mordell [2], [3] and Barrow [3] were based on trigonometry. 
Later, several elementary proofs were given, using either not too well-known 
results (a theorem of Pappus in [4], a theorem of Ptolemy in [7]) or clever angular 
computations with similar triangles in [5]. In [6] a whole series of related inequali- 
ties was established by elementary means; for the proof of (1), however, a 
nontrivial transformation (using isogonal conjugates) was applied. 

The purpose of this note is to give still another elementary proof. We use only 
basic notions and results taught in secondary (or even elementary) schools. At the 
same time, our proof seems to be shorter and simpler than any previous one. As 
usual, we also show that the inequality in (1) is strict unless the triangle is 
equilateral and P is its center.

At the end of this note we recall briefly how to deduce several other well-known
geometric inequalities from this theorem. More applications are given in [8]. 

1. Consider first a point P lying on the side a = BC (see Figure 2). Then the
double area of the triangle APC is equal to $br_b$, the double area of the triangle
ABP is equal to $cr_c$ and hence that of the triangle ABC is $br_b + cr_c$. On the other
hand, $R_a$ cannot be shorter than the altitude of ABC drawn from A. Therefore

$$aR_a \ge br_b + cr_c. \tag{2}$$

Observe that this inequality remains valid for every point P in the angular domain
BAC: it suffices to note that inequality (2) is equivalent by similarity to the
inequality corresponding to the intersection P' of the side BC with the ray AP.




Figure 2 


2. If the triangle is equilateral, then the inequality (1) follows at once. Indeed, it
is sufficient to divide the inequality (2) by a = b = c and add it to the two
analogous inequalities $R_b \ge r_c + r_a$, $R_c \ge r_a + r_b$, obtained by cyclical permutation
of the indices.

Moreover, the proof of (2) shows that we have equality in (1) only if P belongs
to all three altitudes of the triangle, i.e., if P is the center of the equilateral
triangle ABC.

3. If the triangle is not equilateral, then we need a simple corollary of the
inequality (2). Applying (2) to the reflection P' of P on the bisector of the angle
BAC, we have $R'_a = R_a$, $r'_b = r_c$ and $r'_c = r_b$ with obvious notation. Hence

$$aR_a \ge br_c + cr_b$$


for any point P in the angular domain BAC.

If P is inside the triangle ABC, then we also have $bR_b \ge cr_a + ar_c$ and
$cR_c \ge ar_b + br_a$ by symmetry. The inequality (1) now follows easily:


$$R_a + R_b + R_c \ge \frac{b^2 + c^2}{bc}\, r_a + \frac{c^2 + a^2}{ca}\, r_b + \frac{a^2 + b^2}{ab}\, r_c > 2(r_a + r_b + r_c).$$


Note that we have strict inequality in the last step because a, b, and c are not all
equal. ∎
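
As a numerical illustration of inequality (1), the following sketch samples points inside a triangle and compares both sides. It is not part of the proof; the helper names (`dist_to_line`, `erdos_mordell_sides`, `random_interior_point`) and the sample triangle are assumptions made only for this example.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def dist_to_line(p, a, b):
    """Distance from the point p to the line through a and b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return abs(cross) / dist(a, b)

def erdos_mordell_sides(A, B, C, P):
    """Return (R_a + R_b + R_c, 2 * (r_a + r_b + r_c)) for the point P."""
    vertex_sum = dist(P, A) + dist(P, B) + dist(P, C)
    side_sum = (dist_to_line(P, B, C) + dist_to_line(P, C, A)
                + dist_to_line(P, A, B))
    return vertex_sum, 2 * side_sum

def random_interior_point(A, B, C, rng):
    """Random convex combination of the vertices with positive weights."""
    u, v, w = (rng.random() + 1e-9 for _ in range(3))
    s = u + v + w
    return ((u * A[0] + v * B[0] + w * C[0]) / s,
            (u * A[1] + v * B[1] + w * C[1]) / s)

if __name__ == "__main__":
    rng = random.Random(0)
    A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)  # hypothetical sample triangle
    P = random_interior_point(A, B, C, rng)
    lhs, rhs = erdos_mordell_sides(A, B, C, P)
    print(lhs >= rhs)
```

For the equilateral triangle and its center the two returned values agree up to rounding, which matches the equality case described in step 2.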


Now let us give some applications. In what follows we shall denote by $m_a$, $m_b$,
$m_c$, $R$, $r$, and $T$ the altitudes, the circumradius, the inradius, and the area of the
triangle ABC, respectively. Note that for any point P inside the triangle we have
clearly


$$ar_a + br_b + cr_c = 2T; \tag{3}$$

applying this to the incenter we obtain the well-known equality

$$(a + b + c)\,r = 2T. \tag{4}$$

(a) For any point P inside the triangle we have

$$R_a + r_a \ge m_a, \quad R_b + r_b \ge m_b, \quad\text{and}\quad R_c + r_c \ge m_c. \tag{5}$$

Summing them and applying the inequality (1) we find that

$$m_a + m_b + m_c \le 1.5\,(R_a + R_b + R_c). \tag{6}$$


By continuity, this inequality remains valid if P is on the boundary of the triangle. 
Moreover, it remains valid for all points P in the plane. Indeed, if P is outside of 
the triangle, then the sum $R_a + R_b + R_c$ decreases as we replace P by its
orthogonal projection on the (closed convex) triangle ABC. 


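(b) Since

$$\frac{m_a + m_b + m_c}{3} \ge \frac{3}{\dfrac{1}{m_a} + \dfrac{1}{m_b} + \dfrac{1}{m_c}} = \frac{6T}{a + b + c} = 3r \tag{7}$$

by (4), we conclude from (6) that

$$R_a + R_b + R_c \ge 6r. \tag{8}$$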


(c) Applying (6) and (7) to the circumcenter (so that $R_a = R_b = R_c = R$) we
obtain


$$9r \le m_a + m_b + m_c \le 4.5R. \tag{9}$$


In particular, this implies the well-known inequality $R \ge 2r$. Note that this last
inequality also follows at once from (8) applied to the circumcenter.
(d) Let us give finally a direct proof (in the same spirit) of the inequality
$R \ge 2r$, without using the Erdős-Mordell theorem. It follows from (5) that

$$aR_a + bR_b + cR_c + ar_a + br_b + cr_c \ge am_a + bm_b + cm_c = 6T.$$

Applying (3) and (4) we conclude that

$$\frac{aR_a + bR_b + cR_c}{a + b + c} \ge 2r. \tag{10}$$


Like (6), this inequality also remains valid even if P is outside the triangle.
Applying it to the circumcenter, the inequality $R \ge 2r$ follows.

Inverting the Difference of Hilbert
Space Projections


Don Buckholtz


Let R and K be subspaces of a Hilbert space H, and let $P_R$ and $P_K$ denote the
orthogonal projections of H onto these subspaces. When is the operator $P_R - P_K$
invertible? We show here that the obvious necessary condition, $H = R \oplus K$, is
sufficient as well. We also find the inverse.


Theorem. Let R and K be subspaces of a Hilbert space H, and let $P_R$ and $P_K$ denote
the orthogonal projections of H onto these subspaces. The following are equivalent:


(i) The operator $P_R - P_K$ is invertible.
(ii) H is the direct sum of R and K.
(iii) There exists a linear idempotent M with range R and kernel K.


If $P_R - P_K$ is invertible, then $(P_R - P_K)^{-1} = M + M^* - I$.


Proof: The equivalence of (ii) and (iii) is well known and easy to prove. What
needs to be shown is that (i) and (iii) are equivalent. Suppose first that there exists
an idempotent M with range R and kernel K. We have $MP_K = 0$ and $P_RM = M$.
Since $I - M$ is idempotent with range K and kernel R, we have the corresponding
results $(I - M)P_R = 0$ and $P_K(I - M) = I - M$. Therefore


$$(M + M^* - I)(P_R - P_K) = (P_RM + P_K(I - M))^* - (I - M)P_R - MP_K = (M + I - M)^* = I.$$

Taking the adjoint yields $(P_R - P_K)(M + M^* - I) = I$; therefore, $(P_R - P_K)^{-1} =
M + M^* - I$.
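
A small exact check of the inverse formula in the plane may help fix ideas. The sketch below assumes the hypothetical choice R = span{(1, 0)} and K = span{(1, 1)} in the real plane (so the adjoint is the transpose), builds M as $P_R(P_R - P_K)^{-1}$, and compares $M + M^* - I$ with the inverse; the helper functions are written only for this illustration.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def inverse2(X):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

I2 = [[F(1), F(0)], [F(0), F(1)]]
P_R = [[F(1), F(0)], [F(0), F(0)]]              # projection onto span{(1, 0)}
P_K = [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]]  # projection onto span{(1, 1)}

D_inv = inverse2(sub(P_R, P_K))   # (P_R - P_K)^{-1}
M = matmul(P_R, D_inv)            # candidate idempotent with range R, kernel K
formula = sub(add(M, transpose(M)), I2)   # M + M* - I

if __name__ == "__main__":
    print(formula == D_inv, matmul(M, M) == M)
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.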


It remains to show that the invertibility of $P_R - P_K$ implies the existence of an
idempotent M with range R and kernel K. To obtain this result, premultiply and
postmultiply the identity


$$(P_R - P_K)P_R = (I - P_K)(P_R - P_K)$$

by $(P_R - P_K)^{-1}$ and call the resulting operator M. We shall show that

$$M = P_R(P_R - P_K)^{-1} = (P_R - P_K)^{-1}(I - P_K)$$


is an idempotent with range R and kernel K. That M has range R is a
consequence of the first expression for M; from the second expression it follows
that M has kernel K.

To establish that M is idempotent, note that


$$M - I = P_R(P_R - P_K)^{-1} - (P_R - P_K)(P_R - P_K)^{-1} = P_K(P_R - P_K)^{-1}.$$

Using the fact that $(I - P_K)P_K = 0$, we obtain

$$M^2 - M = M(M - I) = (P_R - P_K)^{-1}(I - P_K)P_K(P_R - P_K)^{-1} = 0.$$

THE EVOLUTION OF...


Edited by Abe Shenitzer 
Mathematics, York University, North York, Ontario M3J 1P3, Canada 


Glimpses of Algebraic Geometry¹


I. G. Bashmakova and E. I. Slavutin 


PLANE ALGEBRAIC CURVES. Consider the equation 


F(x,y) = 0, (1) 


where F(x, y) is a polynomial with rational coefficients that is irreducible over the
field $\mathbb{Q}$ of rational numbers. The set of points of the real plane $\mathbb{R}^2$ whose
coordinates satisfy the equation (1) is called a plane (rational) algebraic curve. If F
is linear then we speak of a rational line. The points with rational coordinates are
called rational points.

By the order of the curve Γ defined by equation (1) we mean the degree n of
the polynomial F(x, y). The number of points of intersection of Γ and an arbitrary
line Ax + By + C = 0 is exactly n. When counting the number of points of
intersection we must consider multiplicities, complex points, and points at infinity.
We give a few illustrative examples.


a. The curve $x^2 + y^2 = 1$ and the straight line $x + y = 10$ intersect in two
complex points;

b. The curve $y^2 = 1 - x^3$ and the straight line $y = 1$ have the triple point of
intersection P(0, 1);

(Remark. For a discussion of singular and multiple points see [1] and [5]. (Trans.))

c. The curve $y^2 = 4x^2 + x + 2$ and the straight line $y = 2x$ have two points of
intersection, namely the point M(−2, −4) and a point at infinity.


In order to define points at infinity we must introduce homogeneous coordinates,
that is, essentially, we must go from the real plane $\mathbb{R}^2$ to the projective
plane $\mathbb{P}^2$. A point of the projective plane is given by an ordered triple of real
numbers (u, v, w) not all of which are 0. Proportional triples define the same point.
The numbers in a triple (u, v, w) are called homogeneous coordinates in $\mathbb{P}^2$.

We now determine a (partial) correspondence between the points on $\mathbb{R}^2$ and on
$\mathbb{P}^2$. Let (u, v, w) be a point on $\mathbb{P}^2$. If w ≠ 0, then (u/w, v/w, 1) determines the
same point on $\mathbb{P}^2$. We associate with it the point on $\mathbb{R}^2$ with coordinates x = u/w,
y = v/w. If w = 0, then the point (u, v, 0) has no "partner" on $\mathbb{R}^2$. We call such
points points at infinity. All points at infinity lie on the line at infinity w = 0.


¹This is a translation (by Abe Shenitzer, who also titled the piece) of part of the introduction to the
monograph by Bashmakova and Slavutin titled A History of Diophantine Analysis from Diophantus to
Fermat, published in 1984.

In order to change equation (1) to an equation in homogeneous coordinates we
put x = u/w, y = v/w. After obvious simplifications we obtain a homogeneous
equation of the form


P(u,v,w) = 0. (2) 


Now points at infinity have the same status as ordinary points. 

In terms of homogeneous coordinates, our curve $y^2 = 4x^2 + x + 2$ has the
equation $v^2 = 4u^2 + uw + 2w^2$. Putting w = 0, we obtain its two rational points at
infinity $M_1(1, 2, 0)$ and $M_2(1, -2, 0)$. The line v = 2u passes through the point $M_1$.
This is its second point of intersection with our curve.

The classification of curves by order is of great significance. It was introduced
by Descartes (who put in the same class curves of order 2n and 2n − 1) and made
more precise by Newton.

The fundamental theorem related to order is due to Bezout. It states that the
number of points of intersection of a curve of order m and a curve of order n is mn. Of
course, here we must take into consideration multiplicities, complex points, and
points at infinity.

Notwithstanding its importance, the classification of curves by order alone is
rather crude for purposes of diophantine analysis. Two curves of the same order
can have very different sets of rational points. Thus the curve Γ with equation
$x^2 + y^2 = 1$ has infinitely many rational points (with coordinates $x = 2k/(k^2 + 1)$,
$y = (k^2 - 1)/(k^2 + 1)$, k rational), whereas the curve $x^2 + y^2 = 3$ has none.
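
The parametrization of the circle quoted above is easy to test with exact rational arithmetic. The following is only an illustrative sketch; the function name `circle_point` is an assumption of this example.

```python
from fractions import Fraction

def circle_point(k):
    """Rational point on x^2 + y^2 = 1 attached to the rational parameter k."""
    k = Fraction(k)
    d = k * k + 1
    return (2 * k / d, (k * k - 1) / d)

if __name__ == "__main__":
    for k in (0, 1, 2, Fraction(1, 3), -5):
        x, y = circle_point(k)
        print(k, x, y, x * x + y * y == 1)
```

Every rational value of k gives a rational point, which is one way to see that the circle carries infinitely many of them.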

The notion of greatest importance for diophantine analysis is that of birational 
equivalence of curves. 


Definition 1. Two curves f(x, y) and g(u, v) are said to be birationally equivalent if
the coordinates of each of them are expressible in terms of the coordinates of the
other as rational functions with rational coefficients:

$$x = \varphi(u, v), \qquad u = \varphi_1(x, y),$$

$$y = \psi(u, v), \qquad v = \psi_1(x, y).$$

It is clear that the respective sets of rational points of two birationally equivalent
curves coincide with the possible exception of a finite number of points.
Birationally equivalent curves can have different orders, that is, the order of a
curve is not a birational invariant. For example, the quartic curve

$$y^2 = x^4 - x^3 + 2x - 2 = (x - 1)(x^3 + 2)$$

can be transformed by means of the substitution


$$x = (1 + u)/u, \qquad y = v/u^2$$

into the cubic

$$v^2 = 3u^3 + 3u^2 + 3u + 1,$$

with u and v rationally expressible in terms of x and y:


$$u = 1/(x - 1), \qquad v = y/(x - 1)^2.$$
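
The substitution in this example can be checked mechanically. The sketch below, with function names chosen only for this illustration, verifies on sample rational values that the quartic's right-hand side, multiplied by $u^4$, equals the cubic's right-hand side, and that the two maps undo each other.

```python
from fractions import Fraction

def quartic_rhs(x):
    """Right-hand side of y^2 = (x - 1)(x^3 + 2)."""
    return (x - 1) * (x**3 + 2)

def cubic_rhs(u):
    """Right-hand side of v^2 = 3u^3 + 3u^2 + 3u + 1."""
    return 3 * u**3 + 3 * u**2 + 3 * u + 1

def to_quartic(u, v):
    """Substitution x = (1 + u)/u, y = v/u^2 (requires u != 0)."""
    return ((1 + u) / u, v / u**2)

def to_cubic(x, y):
    """Inverse substitution u = 1/(x - 1), v = y/(x - 1)^2 (requires x != 1)."""
    return (1 / (x - 1), y / (x - 1)**2)

if __name__ == "__main__":
    u, v = Fraction(2, 3), Fraction(5, 7)   # arbitrary sample values
    x, y = to_quartic(u, v)
    print(quartic_rhs(x) * u**4 == cubic_rhs(u))
    print(to_cubic(x, y) == (u, v))
```

Since $y^2 u^4 = v^2$ under the substitution, the first identity is exactly what is needed for points of one curve to be carried to points of the other.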

We will see that a quadratic curve with at least one rational point is birationally 
equivalent to a rational straight line. 

It was Henri Poincaré who first called attention to the fundamental significance 
of birational transformations in the study of the arithmetic of algebraic curves. In 
the introduction to his famous paper “On the arithmetical properties of algebraic 
curves” he wrote: “I asked myself if it is not possible to connect many problems of
analysis on a systematic basis by introducing a new classification of homogeneous
polynomials of higher order, analogous in a sense to the classification of quadratic 
forms. 

This classification would have to be built on the foundation of the group of 
birational transformations admitted by the algebraic curve.” [2] 

One of the basic invariants of the group of birational transformations is the 
genus of a curve. To define it, we introduce first the notion of the simplest double 
point on a curve. 

The singular points on a curve Γ given by (1) are the points whose coordinates
satisfy the equations


$$f_x(x, y) = 0, \qquad f_y(x, y) = 0.$$


An algebraic curve has only finitely many such points. A singular point $P(x_0, y_0)$ is
called a double point if at least one of the second partial derivatives $f_{xx}$, $f_{xy}$, and
$f_{yy}$ does not vanish at P. Finally, a simplest double point is a double point at which
the curve has two noncoincident tangents (see Figure 1). When defining the genus
of a curve we will assume that its only singular points are simplest double points.
This is not a serious restriction, for it can be shown that an algebraic curve is
birationally equivalent to one with only simplest double points.

We can now define the genus of a curve. 

Definition 2. By the genus of a plane algebraic curve Γ of order n we mean the
number


$$g = \frac{(n - 1)(n - 2)}{2} - d,$$


where d is the number of simplest double points on the curve.

It is clear that g is an integer. It can be shown that $g \ge 0$. If the order is 1 or 2,
then g = 0. Such curves are called rational. The reason for this is that if a curve Γ
of genus 0 with equation


$$F(x, y) = 0 \tag{3}$$


has a rational point $P(x_0, y_0)$, then the coordinates x and y can be expressed in
the form $x = \varphi(t)$, $y = \psi(t)$, where $\varphi$ and $\psi$ are rational functions with rational
coefficients and $F(\varphi(t), \psi(t)) = 0$. Moreover, $t = \chi(x, y)$, where $\chi$ is also a
rational function with rational coefficients.

One also says that curves of genus 0 can be uniformized by means of rational 
functions. 

If n = 1, that is, in the case of straight lines, it is clear that any two rational
straight lines $Ax + By + C = 0$ and $A_1x + B_1y + C_1 = 0$ are birationally equivalent,
that is, there is just one class of birationally equivalent straight lines.

If n = 2, that is, if the curve is a conic section, and if there is a rational point
$P(x_0, y_0)$ on it, then the curve is birationally equivalent to a straight line. To see this
it suffices to take an arbitrary rational straight line D and to establish a one-to-one
correspondence between the points M on the conic and the points M' on D so
that the points P, M, M' are collinear. Since every conic with a rational point is
equivalent to a rational line, all such conics are (birationally) equivalent to each
other, and so form a single class that includes all rational lines. This implies that if
a conic has a rational point then it has infinitely many rational points.

There are a great many equivalence classes of conics without rational points. 

Poincaré proved the following theorem: “Every curve of genus 0 and order
n > 2 is birationally equivalent to a curve of order n − 2.” (Hilbert and Hurwitz
proved a similar result 10 years earlier; see [3].) Hence a rational curve of genus 0
is always equivalent to a straight line or to a conic.

If a cubic curve has genus 0, then

$$\frac{(3 - 1)(3 - 2)}{2} - d = 0,$$

that is, d = 1. But then the curve must have a simplest double point which is
clearly rational. A straight line passing through the double point P intersects the
curve Γ in just one other point. We will show that in this case the cubic Γ is
birationally equivalent to a rational line. To this end we take a rational line D and
establish a one-to-one correspondence between the points M on Γ and the points
M' on D so that the three points M, M', and the double point P are collinear (see
Figure 2).


Figure 1          Figure 2


This shows that a cubic curve Γ with a simplest double point is birationally
equivalent to a rational line. As such, it can be uniformized by means of rational
functions. For example, P(0, 0) is a double point on the curve $y^2 = x^3 - 2x^2$. If we
pass through P the lines y = kx, then we obtain $k^2x^2 = x^3 - 2x^2$, whence
$x = k^2 + 2$, $y = k(k^2 + 2)$.
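
The uniformization just obtained can be confirmed by direct substitution. The sketch below assumes nothing beyond the formulas in the text; the function name `nodal_cubic_point` is chosen for this illustration.

```python
from fractions import Fraction

def nodal_cubic_point(k):
    """Point of y^2 = x^3 - 2x^2 cut out by the line y = kx through (0, 0)."""
    k = Fraction(k)
    x = k * k + 2
    return (x, k * x)

if __name__ == "__main__":
    for k in (0, 1, Fraction(-3, 2), 5):
        x, y = nodal_cubic_point(k)
        print(k, (x, y), y * y == x**3 - 2 * x**2)
```

Each rational slope k yields one rational point of the cubic, in line with the one-to-one correspondence with a rational line described above.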

Now we consider curves of genus 1. 

It can be shown that curves of genus 1 cannot be uniformized by means of
rational functions but can be represented by means of elliptic functions of one
variable. Hence the name elliptic is attached to such curves.

If an equation

$$f(x, y) = 0 \tag{4}$$

determines a curve Γ of genus 1 with a rational point $P(x_0, y_0)$, then it is possible
to reduce it by means of birational transformations to the form


$$y^2 = x^3 + ax + b. \tag{5}$$

This is the so-called Weierstrass normal form. In this case it is possible to express x
and y in terms of the Weierstrass functions

$$x = \wp(t), \qquad y = \wp'(t).$$

Thus the coordinates of the rational points on a cubic curve cannot, in general, 
be expressed as rational functions of a single parameter. However, if we know one 
or two rational points on such a curve, then we can find yet another rational point 
on it. To do this one makes use of two methods, known, respectively, as the method 
of tangents and the method of secants. 


1. If P is a rational point on a cubic Γ, then by drawing at P the tangent to
Γ we obtain a rational straight line (the slope of this line is rational) that
intersects Γ in a third rational point. This is the method of tangents.
2. If $P_1$ and $P_2$ are rational points on Γ, then the rational line $P_1P_2$ intersects
Γ in a third rational point $P_3$. This is the method of secants.


The fundamental theorem about curves of genus 1 was proved by Poincaré. It 
asserts that: Every rational curve of genus 1 with a rational point is birationally 
equivalent to a cubic curve. 

Thus cubics are a model for the study of the arithmetic of curves of genus 1. 

Consider the set of rational points on an elliptic curve. Using the tangent and
secant methods it is possible to impose on it the structure of an abelian group. In 
essence, this was done by Jacobi in 1835 in [4]. A deeper study of this group was 
carried out by Poincaré [2], who surmised that this group has a finite number of 
generators. He called this number the rank of the cubic curve. It was later shown 
that the rank of a curve is an invariant of the group of birational transformations. 
Poincaré posed the question of the possible values of the rank of a cubic curve. 
This question remains open. The English mathematician L. J. Mordell proved the 
deep result that the rank of an elliptic curve is always finite. 

Poincaré showed that the group of an elliptic curve can contain elements of 
finite order (that is, that it is a group with torsion). In essence, this was already 
known to Fermat and Euler. 

We conclude by considering the geometric sense of the group operations
associated with the method of secants and the method of tangents. We first reduce
a cubic curve Γ with rational points to the form (5). Let A and B be rational
points on Γ and let C' be the point in which the straight line AB intersects Γ.
Then we call the point C, symmetric to C' with respect to the x-axis, the sum of
the points A and B:


$$A \oplus B = C.$$


Thus if C' has coordinates (x, y), then C will have coordinates (x, −y). The
transition from C' to C is of vital importance. It is only then that the operation of
addition acquires the group properties, namely associativity, the existence of a zero
element, and the existence of an additive inverse for each of its elements. The
commutativity property of our operation is obvious.

To add A to itself, that is, to obtain 2A, we use the method of tangents. We
define the point 2A = D to be the point symmetric to the point D' in which the
tangent at A intersects Γ.

It remains to find the point that plays the role of the zero element. No finite
point will do. When we go to homogeneous coordinates by putting x = u/w,
y = v/w, (5) becomes


$$v^2w = u^3 + auw^2 + bw^3. \tag{6}$$


If w and u are 0, then v is arbitrary and we can put v = 1. We denote by $O$ the
point at infinity on Γ with coordinates (0, 1, 0). It is clear that the point
symmetric to $O$ with respect to the axis of abscissas coincides with $O$.

We show that $O$ plays the role of zero. To this end we show that all vertical
lines u = mw intersect at $O$. Indeed, if w and u are 0, then we can put v = 1.
Now let $A(x_0, y_0)$ be a rational point on Γ. Then, according to what has just been
shown, the straight line through A and $O$ is vertical, that is, its equation is $x = x_0$.
This straight line intersects Γ in the three points A, $O$, and $A'(x_0, -y_0)$, the latter
symmetric to A with respect to the x-axis. According to our definition, the sum
of the points A and $O$ is the point symmetric to A', that is, A itself. Thus
$A \oplus O = A$.

Finally, the inverse of A is $A'(x_0, -y_0)$. Indeed, the straight line joining these
two points is vertical, and so intersects Γ at $O$, that is, $A \oplus A' = O$.

Points of finite order are characterized by the fact that nA = A for some n > 1, that
is, there is a return to the initial point.
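
The secant and tangent constructions translate directly into formulas for a curve in the form (5). The following sketch uses the standard slope formulas with exact rational arithmetic; representing the point at infinity by `None`, the function name `add_points`, and the two sample curves ($y^2 = x^3 + 17$ and $y^2 = x^3 + 1$, both with a = 0) are choices made only for this illustration.

```python
from fractions import Fraction

O = None  # the point at infinity, which plays the role of zero

def add_points(P, Q, a):
    """Sum of P and Q on y^2 = x^3 + a*x + b (b does not enter the formulas)."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        return O                              # vertical line: third point is O
    if P == Q:
        m = (3 * x1 * x1 + a) / (2 * y1)      # slope of the tangent at P
    else:
        m = (y2 - y1) / (x2 - x1)             # slope of the secant PQ
    x3 = m * m - x1 - x2                      # third intersection C' = (x3, y3)
    y3 = y1 + m * (x3 - x1)
    return (x3, -y3)                          # reflect C' in the x-axis

if __name__ == "__main__":
    A = (Fraction(-2), Fraction(3))           # on y^2 = x^3 + 17
    B = (Fraction(-1), Fraction(4))
    print(add_points(A, B, 0), add_points(A, A, 0))

    P = (Fraction(2), Fraction(3))            # on y^2 = x^3 + 1
    Q = P
    for n in range(2, 8):
        Q = add_points(Q, P, 0)
        print(n, Q)
```

On the second curve the multiples of P = (2, 3) return to P at the seventh step, an instance of the "return to the initial point" that characterizes points of finite order.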
PROBLEMS AND SOLUTIONS


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 


with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, M. J. Pelling, Richard Pfiefer, Leonard Smiley, John Henry 
Steelman, Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY
problems address on the inside front cover. Submitted problems should include
solutions and relevant references. Submitted solutions should arrive at that address
before June 30, 1997. Additional information, such as generalizations and references,
is welcome. The problem number and the solver’s name and address should
appear on each solution. An acknowledgement will be sent only if a mailing label
is provided. An asterisk (*) after the number of a problem or a part of a problem
indicates that no solution is currently available.


PROBLEMS 


10564. Proposed by Aviezri Fraenkel, Weizmann Institute of Science, Rehovot, Israel.
The Nim-sum of two positive integers with binary expansions $\sum_{i \ge 0} a_i 2^i$ and $\sum_{i \ge 0} b_i 2^i$
is the number with binary expansion $\sum_{i \ge 0} c_i 2^i$, where $a_i$, $b_i$, $c_i$ are in {0, 1} and
$c_i \equiv a_i + b_i \pmod 2$. Let n be a positive integer and let j be a nonnegative integer. How many of
the $2^n$ subsets of the set {1, 2, ..., n} have the property that their elements have Nim-sum
equal to j?


10565. Proposed by D. M. Bloom, Brooklyn College, Brooklyn, New York and Kenneth
Suman, Winona State University, Winona, MN. A rectangle is composed of mn squares
arranged in m rows and n columns. In a certain game, the squares are selected one by one at
random (without replacement). What is the expected number of selections until j columns
of the rectangle are composed entirely of selected squares? (When j = 1, m = 5, and
n = 15, the problem asks for the expected length of a type of bingo game known as a line
game.)


10566. Proposed by Gerry Myerson, Macquarie University, Australia. Let S be a finite set
of cardinality n > 1. Let f be a real-valued function on the power set of S, and suppose
$f(A \cap B) = \min\{f(A), f(B)\}$ for all subsets A and B of S. Prove that


$$\sum (-1)^{n - |A|} f(A) = f(S) - \max f(A),$$

where the sum is taken over all subsets A of S and the maximum is taken over all proper
subsets A of S.


10567. Proposed by Donald Girod, Canisius College, Buffalo, NY. Let $f: [0, 1] \to \mathbb{R}$
be a continuous function with f(0) = f(1) = 0. Show that the Lebesgue measure of
$\{h : f(x + h) = f(x) \text{ for some } x \in [0, 1]\}$ is at least 1/2.


10568. Proposed by Donald E. Knuth, Stanford University, Stanford, CA. Let n be a 
nonnegative integer. The sequence defined by xg = n and x44.) = xy — | /XK | fork > 0 
converges to 0. Let f(n) be the number of steps required; i.e., $x_{f(n)} = 0$ but $x_{f(n)-1} > 0$.
Find a closed form for f(n).


10569. Proposed by W. M. Priestley, University of the South, Sewanee, TN. Let X and Y
be countable subsets of real numbers (each endowed with the subspace topology). If there
exist one-to-one continuous maps of X onto Y and of Y onto X, does it follow that X and
Y are homeomorphic?


10570. Proposed by Emeric Deutsch, Polytechnic University, Brooklyn, NY. An ordered tree
is a rooted tree in which the children of each node form a sequence rather than a set. The
height of an ordered tree is the number of edges on a path of maximum length starting at the
root. Let a(n, k) denote the number of ordered trees with n edges and height k and let S(n, k)
be the Stirling numbers of the second kind (the number of partitions of {1, 2, ..., n} into
k nonempty parts). Note that a(n, 1) = S(n, 1), since both numbers are 1. Show that (a)
a(n, 2) = S(n, 2), (b) a(n, 3) + a(n, 4) = S(n, 3), and (c)* generalize these observations.


SOLUTIONS 


An Arithmetic Function of Modest Size 


10192 [1992, 61]. Proposed by the late Paul Erdős, Hungarian Academy of Sciences,
Budapest, Hungary. Let L(n) denote the least common multiple of the positive integers not
exceeding n. For $n \ge 2$ let g(n) denote the largest positive integer k such that $n^k \mid L(n)$.
For example, g(2) = 1, g(30) = 2, g(420) = 3. Prove that for x large


$$\max_{2 \le n \le x} g(n) = \log x/\{\log\log x + o(1)\}.$$


Solution by the editors based on the solutions of Richard Stong, Rice University, Houston,
TX, and the proposer. We use the prime number theorem in the following two forms

$$\lim_{t \to \infty} \frac{\pi(t)}{t/\log t} = 1, \qquad \lim_{t \to \infty} \frac{\theta(t)}{t} = 1,$$

as well as Chebyshev’s theorem that $\pi(2j) > \pi(j)$ for j = 1, 2, 3, .... Here $\pi(t)$ is the
number of prime numbers not exceeding t and $\theta(t)$ is the sum of the logarithms of the primes
not exceeding t.

Clearly


$$L(n) = \prod_{p} p^{\lfloor \log n/\log p \rfloor},$$


where p runs through the prime numbers not exceeding n. Thus if $p^a \mid n$ but $p^{a+1} \nmid n$ for
some prime p, and if $n^g \mid L(n)$, then $ga \le \log n/\log p$ or $g \le \log n/\log p^a$. Hence if $q_n$
is the largest prime power dividing n, we have $g(n) = \lfloor \log n/\log q_n \rfloor \le \omega(n)$, where $\omega(n)$
is the number of distinct prime factors of n. (Actually $g(n) < \omega(n)$ whenever $\omega(n) > 1$.)
Now suppose we are given a positive number $\epsilon$. If $q_n \ge e^{-\epsilon} \log n$, then


$$g(n) = \left\lfloor \frac{\log n}{\log q_n} \right\rfloor \le \frac{\log n}{\log(e^{-\epsilon} \log n)} = \frac{\log n}{\log\log n - \epsilon}.$$


If $q_n < e^{-\epsilon} \log n$, then $g(n) \le \omega(n) \le \pi(q_n) \le \pi(e^{-\epsilon} \log n)$. Since $e^{-\epsilon} < 1$, the
prime number theorem gives $\pi(e^{-\epsilon} \log n) < \log n/\log\log n$ if n is sufficiently large.
Thus in either case $g(n) \le \log n/(\log\log n - \epsilon)$, provided n is sufficiently large. Since
$\log x/(\log\log x - \epsilon)$ is an increasing function of x for $x > \exp\exp(1 + \epsilon)$, we obtain


$$\max_{2 \le n \le x} g(n) \le \log x/(\log\log x - \epsilon) \tag{1}$$


for large x.

To get a lower bound let $p_1, p_2, \ldots$ be the primes in increasing order. If x is a positive
number greater than 2, let k be the largest integer such that $p_1 p_2 \cdots p_k \le x$, i.e., such that
$\theta(p_k) \le \log x$. If $m = p_1 p_2 \cdots p_k$, then $g(m) = \lfloor \log m/\log p_k \rfloor$. Since $x < p_{k+1} m <
2p_k m \le p_k^2 m$, we have $\log x < \log m + 2\log p_k$, so that $g(m) > (\log m/\log p_k) - 1 >
(\log x/\log p_k) - 3$. If x is large, then k is large and so the prime number theorem gives
$\log x \ge \theta(p_k) > e^{-\epsilon/2} p_k$ or $\log p_k < \log\log x + \epsilon/2$. Hence


$$\max_{2 \le n \le x} g(n) \ge g(m) > \log x/(\log\log x + \epsilon/2) - 3 > \log x/(\log\log x + \epsilon), \tag{2}$$

provided x is sufficiently large. In view of (1) and (2), the assertion of the problem is 
established. 
Although g(n) is sometimes much smaller than $\omega(n)$, a similar argument gives

$$\max_{2 \le n \le x} \omega(n) = \frac{\log x}{\log\log x - 1 + o(1)}.$$

While the value of n in the interval [2, x] for which $\omega(n)$ is largest is obvious, such is not
the case for g(n).
Paul Erdős reminded the editors that in his original proposal he put forward only the
weaker assertion


$$\max_{2 \le n \le x} g(n) = (1 + o(1)) \log x/\log\log x,$$


which is a little easier to prove than the assertion of the problem as published. Erdős also
remarked that it would be of interest to determine the sequence of “champions” for the
arithmetical function g, i.e., to determine for each k the smallest integer $n_k$ greater than 1
for which $g(n_k) = k$. For example, $n_1 = 2$, $n_2 = 30$, $n_3 = 420$, $n_4 = 27720$. Since
$g(n) < \omega(n)$ when n is not a prime power, $n_k$ must have at least k + 1 prime factors when
k > 1.
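
The function g and its first few "champions" are easy to compute directly. The sketch below relies on the observation from the solution that the exponent of a prime p in L(n) is the largest e with $p^e \le n$; the function names and the brute-force search in `champion` are assumptions of this illustration, not the editors' method.

```python
def prime_factorization(n):
    """Return {prime: exponent} for an integer n >= 2 by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def lcm_exponent(p, n):
    """Largest e with p**e <= n, i.e. the exponent of p in L(n)."""
    e, power = 0, p
    while power <= n:
        e += 1
        power *= p
    return e

def g(n):
    """Largest k with n**k dividing L(n) = lcm(1, ..., n), for n >= 2."""
    return min(lcm_exponent(p, n) // a
               for p, a in prime_factorization(n).items())

def champion(k, limit):
    """Smallest n in [2, limit] with g(n) == k, or None if there is none."""
    for n in range(2, limit + 1):
        if g(n) == k:
            return n
    return None

if __name__ == "__main__":
    print([g(n) for n in (2, 30, 420, 27720)])
    print([champion(k, 30000) for k in (1, 2, 3, 4)])
```

Integer loops are used instead of floating-point logarithms so that the floor in $\lfloor \log n/\log p \rfloor$ is never affected by rounding.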


Solved also by L. E. Mattics. 


World Series, 1994 


10223 [1992, 462]. Proposed by Julio Kuplinsky, Amherst, NY. For $p \in \mathbb{R}$, $q = 1 - p$, and
positive integers n, prove


$$\sum_{k=n}^{2n-1} \binom{k-1}{n-1} \left[ p^n q^{k-n} + p^{k-n} q^n \right] = 1.$$


Editorial comment. The challenge here was to give a proof that is valid for all $p \in \mathbb{R}$,
since problem E3386 [1990, 427; 1992, 22] obtained this result for 0 < p < 1 through a
probabilistic interpretation. Several readers noted that the sum on the left side of the desired
identity defines a polynomial in p, so the general identity follows from its truth on any
infinite set. Proofs avoiding the probabilistic interpretation can be constructed by rewriting
$\binom{k-1}{n-1}$ as $\binom{k}{n} - \binom{k-1}{n}$ and rearranging terms to allow the sum for n to be compared to that
for n + 1.
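
Because the left side is a polynomial in p, the identity can be spot-checked exactly at rational values outside [0, 1] as well. The following sketch does this with Python's `fractions` module; the function name `series_sum` is an assumption of this example.

```python
from fractions import Fraction
from math import comb

def series_sum(n, p):
    """Left side of the identity for a positive integer n and rational p."""
    p = Fraction(p)
    q = 1 - p
    return sum(comb(k - 1, n - 1) * (p**n * q**(k - n) + p**(k - n) * q**n)
               for k in range(n, 2 * n))

if __name__ == "__main__":
    for p in (Fraction(1, 3), -2, Fraction(5, 2)):
        print(p, [series_sum(n, p) for n in range(1, 6)])
```

A check of this kind at finitely many points is of course no proof, but by the polynomial remark above, agreement on any infinite set of values of p would be.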

Frank Schmidt has noted that the result also follows from the additional remarks to the 


solution of E1829 [1965, 1201; 1967, 323; 1967, 1134], establishing the identity 


= k—1 n —n = r r— 
E (fata af (eo 


k=n k=n 
for all r > n. Problem E2681 [1977, 728; 1979, 129] is also related. Murray S. Klamkin
pointed out that a general two-variable form appears in Crux Mathematicorum, Problem 
183 [1976, 193; 1977, 69], and a form with an arbitrary number of variables appears in 
SIAM Review, Problem 85-10, 28 (1986), p. 243. Furthermore, a solution to a variant of 
this problem has already appeared in these pages: see Doron Zeilberger, On an identity of 
Daubechies, this MONTHLY 100 (1993), 487. Earlier appearances of the problem include 
solutions of the two types outlined. 


Solved by J. Anglesio (France), D. Beckwith, J. C. Binz (Switzerland), M. Bowron, D. Callan, R. J. Chapman (U. K.), 
W. Chu (China), P. Deiermann, S. B. Ekhad, J. Fukuta (Japan), D. A. Grable, C. P. Grant, P. Griffin, H. van Haeringen
(The Netherlands), R. Holzsager, W. K. Jeong (Korea), A. M. Karparvar (Iran), M. S. Klamkin (Canada), B. G. Klein,
N. Komanda, D. W. Koster, I. I. Kotlarski, R. A. Leslie, O. P. Lossers (The Netherlands), M. Mécsy (Hungary), I. Nemes
(Austria), K. Perera, C. R. Pranesachar (India), P. Ranaldi, R. W. Richards, R. Richberg (Germany), J. B. Robertson,
F. Schmidt, J. H. Steelman, H. L. Stubbs, L. Verde-Star (Mexico), M. Vowe (Switzerland), H. S. Wilf, Centre Problem
Solving Group, and the proposer. 


Primitive Elements Modulo Primes and Their Squares 


10311 [1993, 499]. Proposed by Solomon W. Golomb, University of Southern California,
Los Angeles, CA. It is well-known that if $g$ is a primitive root modulo $p$, where $p > 2$ is
prime, either $g$ or $g + p$ (or both) is a primitive root modulo $p^2$ (indeed modulo $p^k$ for all
$k > 1$).

(a) Find an example of a prime $p > 2$ and a primitive root $g$ modulo $p$ with $1 < g < p$
such that $g$ is not a primitive root modulo $p^2$.

(b) Show that, among all $\varphi(p - 1)$ primitive roots $g$ modulo $p$ with $1 < g < p$, at least
half of them are also primitive roots modulo $p^2$.


Editorial comment. For part (a), the most popular solution was $(p, g) = (29, 14)$, for which
$p$ is minimal. The example $(p, g) = (40487, 5)$ (see E. L. Litver & G. E. Judina, Primitive
roots for the first million primes and their powers, Mathematical Analysis and Its Applications
3 (1971), 106-109, Izdat. Rostov. Univ.) was given as the only example with $p$ less
than one million in which $g$ is the smallest primitive root modulo $p$. Other examples given
by readers were $(p, g) = (71, 11)$, $(37, 18)$ and $(487, 10)$. The latter example appears in
several books (e.g., D. Shanks, Solved and Unsolved Problems in Number Theory, Chelsea,
1985, ex. 79, p. 102).

A weaker version of (b) appeared in Math. Magazine as Problem 1419 [1993, 126; 
1994, 148]. The solution published there could be modified to solve this problem. That 
argument is similar to one found in Alfred Brauer, Elementary estimates for the least primi- 
tive root, Studies in mathematics and mechanics presented to Richard von Mises, Academic 
Press, 1954, 20-29, where the result appears as Theorem 7. From there, the method can 
be traced back to V. A. Lebesgue, Théorème sur les racines primitives, Comptes Rendus
Acad. Sci. Paris 64 (1867), 1268-1269. Following a suggestion of Paul Bateman, we reprint 
Lebesgue’s exact words. 

Soit $g < p$ une racine primitive pour le module premier $p$; soit encore $g' < p$ et
$g' \equiv g^{p-2}$ (mod. $p$). Le nombre $g' < p$ sera aussi racine primitive. Ces racines
$g$, $g'$, satisfaisant à la condition $gg' \equiv 1$ (mod. $p$), sont associées. L'une d'elles au
moins est racine primitive pour le module $p^n$, quel que soit l'exposant $n$.

No proof is given there, but this statement was introduced with the words: “La démonstration
ne présente pas de difficultés.”

Published tables are useful in finding the examples requested in (a). In particular, for
$g < 100$ and $p < 2^{32}$ one need identify only the primitive roots among the values given
in Peter L. Montgomery, New solutions of $a^{p-1} \equiv 1 \pmod{p^2}$, Math. Comp. 61 (1993),
361-363. This is an easy exercise for a computer algebra system. Thus, the pair $(p, g) =
(113, 68)$ is seen to be an example of a primitive root $g$ modulo $p$ with $g^{p-1} \equiv 1 \pmod{p^2}$.

This problem suggests study of the set $G(p)$ of numbers $g$ that are primitive roots modulo
$p$ with $0 < g < p$, but fail to be primitive roots modulo $p^2$. Some readers included the
results of computational work determining $G(p)$ for all $p$ in certain intervals. In particular,
Albert Wassermann included a complete table of the sets $G(p)$ with $p < 1000$, and John P.
Robertson summarized a computer search of $2 < p < 20000$. This range was divided
into eight subintervals and the number of $p$ in each subinterval with each possible size of
$G(p)$ was given. Some noteworthy examples of $G(p)$ were also included. For example,
$G(653) = \{84, 120, 287, 410\}$ and $G(16631) = \{274, 11047, 14697, 16026\}$ are the only
examples of $\#(G(p)) \ge 4$ in this range.

Numerical evidence suggests that $\#(G(p))$ is really much smaller than $\varphi(p - 1)/2$.
Such a result is known. Paul Bateman also provided a reference to S. D. Cohen, R. W.
K. Odoni, and W. W. Strothers, On the least primitive root modulo $p^2$, Bull. London Math.
Soc. 6 (1974), 42-46, where it is shown that, for any $c > 1/2$, there is a quantity $P(c)$ such
that $p > P(c)$ implies that $\#(G(p)) < p^c$.
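
The sets $G(p)$ discussed above are easy to compute for small $p$. The sketch below is one possible way to do so, not the method used by any of the solvers; it relies on the standard fact that a primitive root $g$ modulo $p$ has order $p - 1$ or $p(p - 1)$ modulo $p^2$, so it fails modulo $p^2$ exactly when $g^{p-1} \equiv 1 \pmod{p^2}$. The function names are illustrative assumptions.

```python
def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def primitive_roots(p):
    """All primitive roots g modulo the odd prime p with 1 <= g < p."""
    qs = prime_factors(p - 1)
    return [g for g in range(1, p)
            if all(pow(g, (p - 1) // q, p) != 1 for q in qs)]


def exceptional_roots(p):
    """G(p): primitive roots modulo p that are not primitive roots modulo p^2."""
    return [g for g in primitive_roots(p) if pow(g, p - 1, p * p) == 1]


if __name__ == "__main__":
    print(exceptional_roots(29))   # [14]
    for p in (37, 71, 113, 487):
        print(p, exceptional_roots(p))
```

For $p = 29$ the only exceptional root is 14, in agreement with the example in part (a); comparing `len(exceptional_roots(p))` with `len(primitive_roots(p))` gives a direct numerical view of part (b).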

Solved by D. Alvis, R. Barbara (Lebanon), P. T. Bateman & W. P. Wardlaw, K. A. Beres, V. Božin (Yugoslavia), D. Callan
(part b only), R. J. Chapman (U. K.), J. Christopher, H. M. Edgar, H. S. Gunaratne (Brunei), W. Johnson, I. Kastanas,
D. W. Koster, Y. H. Kwong, D. E. Manes, M. Newman (Israel, part b only), J. P. Robertson, R. M. Robinson, H. Schmidt Jr.
(part a only), R. Simion & F. Schmidt, J. H. Steelman, A. Wassermann (Germany, part a only), GCHQ Problem Solving
Group (U. K.), the MMRS group of Oklahoma State University, and the proposer.


Asymptotics in Three Parts 


10335 [1993, 797]. Proposed by David Borwein, University of Western Ontario, London,
Ontario, Canada, and Jonathan Borwein, Simon Fraser University, Burnaby, British
Columbia, Canada. Let $r$ be a positive constant and $c_0 > 0$. Consider the iteration
$c_{n+1} = c_n + r - c_n/\sqrt{1 + c_n^2}$. (a) For which values of $r$ does the sequence $(c_n)$ converge?
(b) In case of convergence to $c$ with $c \ne c_0$, prove that $\lim (c_{n+1} - c)/(c_n - c)$ exists and
determine its value. (c) In case of divergence, find an asymptotic expression for $c_n$.


Solution for $r \ne 1$ by Heinz-Jürgen Seiffert, Berlin, Germany. More generally, for $r > 0$,
$k > 0$, and $c_0 > 0$, we consider the iteration

\[ c_{n+1} = c_n + r - \frac{c_n}{(1 + c_n^k)^{1/k}}. \]

(a) If $(c_n)$ converges to $c$, then $r = c/(1 + c^k)^{1/k} < 1$. Hence, the condition $0 < r < 1$ is
necessary for the convergence of $(c_n)$. We show that it is also sufficient. Let $0 < r < 1$ and
define $M = \max\{c_0,\ r + r/(1 - r^k)^{1/k}\}$. Consider the mapping $T : [0, M] \to \mathbb{R}$ defined
by $T(x) = x + r - x/(1 + x^k)^{1/k}$. If $0 \le x \le r/(1 - r^k)^{1/k}$, then $0 < r \le T(x) \le
x + r \le M$. If $r/(1 - r^k)^{1/k} < x \le M$, then $0 < r < T(x) < x \le M$. Thus, we have
$T([0, M]) \subseteq [0, M]$. Furthermore, for all $x \in (0, M]$,

\[ 0 < T'(x) = 1 - \frac{1}{(1 + x^k)^{1 + 1/k}} \le 1 - \frac{1}{(1 + M^k)^{1 + 1/k}} < 1, \]

so $T$ is a contraction. Since $c_0 \in [0, M]$ and $c_{n+1} = T(c_n)$, the contraction mapping
principle implies that $(c_n)$ converges to the unique fixed point $c \in [0, M]$ of $T$. The
equation $T(c) = c$ is easily solved, giving $c = r/(1 - r^k)^{1/k}$.


(b) Let $0 < r < 1$ and $c = r/(1 - r^k)^{1/k} \ne c_0$. Since $T'(x) > 0$, all $c_n \ne c$ and

\[ \lim_{n \to \infty} \frac{c_{n+1} - c}{c_n - c} = \lim_{n \to \infty} \frac{T(c_n) - T(c)}{c_n - c} = T'(c) = 1 - \frac{1}{(1 + c^k)^{1 + 1/k}} = 1 - (1 - r^k)^{1 + 1/k}. \]

(c) Let $r > 1$ and $k > 0$. Using the obvious inequality $c_{n+1} > c_n + r - 1$, for $n \ge 0$,
an easy induction gives $c_n > (r - 1)n$ for all $n \ge 0$. Now, $1 - c_n/(1 + c_n^k)^{1/k} =
1 - (1 + c_n^{-k})^{-1/k} = O(c_n^{-k})$, for $n \ge 1$, and it now follows that $c_{n+1} = c_n + r - 1 + O(n^{-k})$.
Hence, for all $n \ge 2$,

\[ c_n = \begin{cases} (r - 1)n + O(n^{1-k}) & \text{if } 0 < k < 1, \\ (r - 1)n + O(\log n) & \text{if } k = 1, \\ (r - 1)n + O(1) & \text{if } k > 1. \end{cases} \]

Thus for $r > 1$ and $k > 0$, we have $c_n \sim (r - 1)n$ as $n \to \infty$.
When $r = 1$, we conjecture that for all $k > 0$, $c_n \sim ((k + 1)n/k)^{1/(k+1)}$ as $n \to \infty$.
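
The three regimes can be observed numerically. The following floating-point sketch is an illustration only, with hypothetical helper names; it iterates the generalized recurrence and compares the results with the fixed point from (a), the ratio from (b), the slope $r - 1$ from (c), and the conjectured growth rate for $r = 1$.

```python
def iterate(r, k, c0, steps):
    """Apply c -> c + r - c / (1 + c**k)**(1/k) `steps` times, starting at c0."""
    c = c0
    for _ in range(steps):
        c = c + r - c / (1.0 + c**k) ** (1.0 / k)
    return c


def fixed_point(r, k):
    """Limit obtained in part (a) for 0 < r < 1."""
    return r / (1.0 - r**k) ** (1.0 / k)


def ratio_limit(r, k):
    """Value T'(c) obtained in part (b)."""
    return 1.0 - (1.0 - r**k) ** (1.0 + 1.0 / k)


if __name__ == "__main__":
    r, k, c0 = 0.5, 2.0, 3.0
    c = fixed_point(r, k)
    c20, c21 = iterate(r, k, c0, 20), iterate(r, k, c0, 21)
    # parts (a) and (b): the iterate approaches c, and successive errors
    # shrink by a factor close to T'(c)
    print(c21, c, (c21 - c) / (c20 - c), ratio_limit(r, k))

    n = 100_000
    # part (c): for r > 1 the ratio c_n / n approaches r - 1
    print(iterate(1.5, 2.0, 1.0, n) / n)
    # r = 1, k = 2: compare with ((k + 1) n / k) ** (1 / (k + 1))
    print(iterate(1.0, 2.0, 1.0, n) / (1.5 * n) ** (1.0 / 3.0))
```

The starting values and step counts are arbitrary choices; a numerical run of this kind supports but of course does not prove the asymptotic statements.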


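Solution for $r = 1$ (and $k = 2$) by Robert D. Brown & Pawel Szeptycki, University of
Kansas, Lawrence, KS. We show that $c_n n^{-1/3} \to (1.5)^{1/3}$ as $n \to \infty$ (this can be guessed
by considering the differential equation $x' = 0.5x^{-2}$ suggested by the following estimates).
We have

\[ c_{n+1} = c_n + \frac{1}{\sqrt{1 + c_n^2}\,\bigl(\sqrt{1 + c_n^2} + c_n\bigr)} \]

and so $L(c_n) < c_{n+1} < R(c_n)$ with $L(x) = x + 1/(2 + 2x^2)$ and $R(x) = x + 1/(2x^2)$.
Both $L(x)$ and $R(x)$ are increasing for $x \ge 1$. Let $\epsilon > 0$. Choose $n_\epsilon$ so that

\[ \frac{1 - 2\epsilon/3}{x^{2/3}} < \frac{1}{1 + x^{2/3}} \]

for $x \ge (1.5 - \epsilon)n_\epsilon$, and then choose $m_\epsilon$ so that $c_{m_\epsilon} > \sqrt[3]{(1.5 - \epsilon)n_\epsilon}$. Use induction on $n$
to show that

\[ c_{n + m_\epsilon} > \sqrt[3]{(1.5 - \epsilon)(n + n_\epsilon)} \qquad (*) \]

for all $n \ge 0$; the case $n = 0$ follows from our definitions. Suppose (*) holds for $n$, and let
$x = (1.5 - \epsilon)(n + n_\epsilon)$. The case $n + 1$ of (*) holds if $\sqrt[3]{x + 1.5 - \epsilon} < L(\sqrt[3]{x})$. However,
since $\sqrt[3]{a + b} - \sqrt[3]{a} < b/(3\sqrt[3]{a^2})$ for $a, b > 0$, this follows from the choice of $n_\epsilon$. Thus
$\liminf_{n \to \infty} c_n/\sqrt[3]{n} \ge \sqrt[3]{1.5 - \epsilon}$. Similarly, choose $k_\epsilon$ so that $c_0 < \sqrt[3]{(1.5 + \epsilon)k_\epsilon}$ and
$(1 + 2\epsilon/3)/(x + 1.5 + \epsilon)^{2/3} > x^{-2/3}$ for all $x \ge (1.5 + \epsilon)k_\epsilon$. A similar induction shows
that $c_n < \sqrt[3]{(1.5 + \epsilon)(n + k_\epsilon)}$. Thus $\limsup_{n \to \infty} c_n/\sqrt[3]{n} \le \sqrt[3]{1.5 + \epsilon}$.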


Editorial comment. Kiran Kedlaya used an asymptotic expansion of $T(x)$ to determine a
differential equation that could be used to guess the asymptotic behavior of $c_n$ for $r = 1$.
Allan Pedersen formulated


Lemma. Let a sequence $(x_n)$ be defined by $x_0 = c_0 > 0$ and $x_{n+1} = x_n + g(x_n)$ for $n \ge 0$,
where $g(x)$ is a continuous, positive, decreasing function for $x > 0$. Let $x(t)$ be the solution
to the initial value problem

\[ x(0) = c_0, \qquad \frac{dx}{dt} = g(x), \quad t \ge 0. \]

Then $x_n \to \infty$ and $x_n = x(n) + O(1)$ as $n \to \infty$.


Proof. See MONTHLY Problem 6610 [1989, 744; 1991, 448]. 

Although an explicit solution of the differential equation is possible in this case, its main
use is to obtain an asymptotic expression. O. P. Lossers remarked that for $r > 1$, one can
prove the validity of an expansion

\[ c_n = n(r - 1) + d + \sum_{k=1}^{\infty} a_k n^{-k} \]

as $n \to \infty$, whose coefficients could be calculated by substitution in the recurrence. A
similar process for $r = 1$ gives

\[ c_n^3 = \frac{3}{2}n - \frac{27}{8}\left(\frac{2}{3}\right)^{2/3} n^{1/3} + \frac{1}{2}\log n + K + O(n^{-1/3}), \]

where $K$ is a constant.


Solved also by J. Anglesio (France), P. Bracken (Canada), R. J. Chapman (U. K.), D. A. Darling, D. Doster, M. Dresevié 
(Yugoslavia), N. Eklund, J. Ferrer (Spain), R. A. Groeneveld, H. S. Gunaratne (Brunei), R. Holzsager, K. S. Kedlaya, 
P. G. Kirmser, E. H. Larson, K.-W. Lau (Hong Kong), O. P. Lossers (The Netherlands), B. Margolis (France), C. A. 
Minh, A. Pedersen (Denmark), I. A. Sakmar (Turkey), K. Schilling, M. Vowe (Switzerland), Z. Zhang (China), GCHQ 
Problem Solving Group (U. K.), NSA Problems Group, and the proposer. 


Expected Number of Sums in a Given Set 


10336 [1993, 797]. Proposed by Ignacy I. Kotlarski, Oklahoma State University, Stillwater,
OK. Let $X_1, X_2, \dots$ be a sequence of independent identically distributed random variables,
each exponentially distributed with parameter $a$, $a > 0$, i.e., for $k = 1, 2, \dots$,

\[ \Pr(X_k \le x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 - e^{-ax} & \text{if } x \ge 0. \end{cases} \]

Let $B$ be a fixed Borel set in $[0, \infty)$ such that its Lebesgue measure $\mu_L(B)$ is finite and
positive. Let $Y_k = X_1 + \cdots + X_k$ for $k = 1, 2, \dots$, and $\theta = \sum_{k=1}^{\infty} \Pr(Y_k \in B)$.

(a) Find $\theta$ as a function of $a$.

(b) Find a uniform minimum variance unbiased estimator of $\theta$ from a sample from the above
exponential distribution of a fixed size $n$.


Solution I of (a) by Robert A. Agnew, FMC Corporation, Chicago, IL. $\theta = a\mu_L(B)$. It is well
known that the probability density function of $Y_k$ on $[0, \infty)$ is $f_k(y) = a^k y^{k-1} e^{-ay}/(k - 1)!$
(the gamma distribution). Hence $\Pr(Y_k \in B) = \int_B f_k(y)\,dy$ and

\[ \theta = \sum_{k=1}^{\infty} \int_B f_k(y)\,dy = \int_B \sum_{k=1}^{\infty} f_k(y)\,dy = \int_B a\,dy = a\mu_L(B). \]


Solution II of (a) by Kenneth Schilling, University of Michigan, Flint, MI. We prove that
$\theta = a\mu_L(B)$. The function $\theta(B) = \sum_{k=1}^{\infty} \Pr(Y_k \in B)$ is a countably additive measure on
the Borel sets, so it suffices to prove the claim when $B$ is the interval $[0, t)$ for $t > 0$.

Let $\{Z_t\}_{t \ge 0}$ be the Poisson process whose interarrival times are $\{X_1, X_2, \dots\}$, that is,
$Z_t = \max\{k : Y_k < t\}$. For $t > 0$, the random variable $Z_t$ has a Poisson distribution with
mean $at$. Thus, if $B = [0, t)$, we have

\[ \theta(B) = \sum_{k=1}^{\infty} \Pr(Y_k < t) = E\left(\sum_{k=1}^{\infty} 1_{\{Y_k < t\}}\right) = E(Z_t) = at. \]


Solution of (b) by Markus Roters, University of Trier, Trier, Germany. For each $n \in \mathbb{N}$, the
random variable $Y_n$ has an Erlang distribution with parameters $n$ and $a$, and is a complete
and sufficient statistic for the family of joint distributions of $X_1, \dots, X_n$ indexed by the
parameter $a$. Hence the Rao-Blackwell and Lehmann-Scheffé theorems imply that if an
unbiased estimator for $\theta$ exists, there also exists one depending on $Y_n$, which is then a
uniform minimum variance unbiased estimator (UMVUE).

For $n = 1$, there is no unbiased estimator for $\theta$. Indeed, if $d(X_1) = d(Y_1)$ were such an
unbiased estimator, it would follow for all $a > 0$ that

\[ E(d(Y_1)) = \theta \quad \text{if and only if} \quad \int_{[0,\infty)} d(y)\,e^{-ay}\,d\mu_L(y) = \mu_L(B), \]

and hence, by differentiating with respect to $a$, for all $a > 0$,

\[ \int_{[0,\infty)} (-y\,d(y))\,e^{-ay}\,d\mu_L(y) = 0. \]

But now, by completeness, we could conclude $d(Y_1) = 0$ a.s., an impossibility.

However, for $n \ge 2$, one computes that $d^*(Y_n) = (n - 1)\mu_L(B)/Y_n$ is unbiased for $\theta$ and
hence is the desired UMVUE. In fact, for $n = 2$, any unbiased estimator $Z$ must have
infinite variance. Otherwise, by Rao-Blackwell, there would exist an unbiased estimator
of the form $f(Y_2)$, which has finite variance since $\mathrm{Var}(f(Y_2)) \le \mathrm{Var}(Z) < \infty$. But $Y_2$
is complete, so there is at most one unbiased estimate of $\theta$ that is a function of $Y_2$. Since
$\mu_L(B)/Y_2$ is an unbiased estimate of $\theta$, it follows that $f(Y_2) = \mu_L(B)/Y_2$ a.s., but this
is impossible since $1/Y_2$ has infinite variance. Thus no unbiased finite variance estimator
exists. $\mathrm{Var}\, d^*(Y_n) < \infty$ if and only if $n \ge 3$.
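
Both parts lend themselves to a numerical sanity check. The sketch below is illustrative only: it sums the gamma densities at a point to confirm that the series equals $a$, estimates $\theta$ by simulation for the assumed choice $B = [0.5, 2)$, and averages the estimator $d^*(Y_n)$ for an assumed sample size $n = 5$. All function names, the interval, and the parameter values are assumptions made for the example.

```python
import math
import random


def gamma_density_sum(a, y, terms=200):
    """Partial sum of f_k(y) = a^k y^(k-1) e^(-ay) / (k-1)! over k = 1..terms."""
    total, term = 0.0, a * math.exp(-a * y)   # term for k = 1
    for k in range(1, terms + 1):
        total += term
        term *= a * y / k                     # ratio f_{k+1}(y) / f_k(y)
    return total


def simulate_theta(a, lo, hi, trials, rng):
    """Monte Carlo estimate of theta for B = [lo, hi): mean number of Y_k in B."""
    hits = 0
    for _ in range(trials):
        y = rng.expovariate(a)
        while y < hi:
            if y >= lo:
                hits += 1
            y += rng.expovariate(a)
    return hits / trials


def simulate_estimator(a, n, mu_B, trials, rng):
    """Monte Carlo mean of d*(Y_n) = (n - 1) mu(B) / Y_n, Y_n ~ Gamma(n, scale 1/a)."""
    total = sum((n - 1) * mu_B / rng.gammavariate(n, 1.0 / a) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    rng = random.Random(12345)
    a = 2.0
    print(gamma_density_sum(a, 1.3))                     # close to a
    print(simulate_theta(a, 0.5, 2.0, 100_000, rng))     # close to a * 1.5
    print(simulate_estimator(a, 5, 1.5, 200_000, rng))   # close to a * 1.5
```

With $n = 5$ the estimator has finite variance, so the sample mean settles quickly; repeating the last call with $n = 2$ shows much less stable averages, consistent with the infinite variance noted above.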


Solved also by D. Callan, D. A. Darling, E. Hertz, C. Peters, G. S. Rogers, E. A. Weinstein, GCHQ Problem Solving 
Group (U. K.), and the proposer. 


Generalizing “Every Even Number Is The Sum of Two Odds” 


10338 [1993, 873]. Proposed by Charles Vanden Eynden, Illinois State University, Normal, 
IL. Given an integer n > 1, determine the set of integers which can be written as a sum of 
two integers relatively prime to n. 


Solution I by Kevin Ford, University of Texas, Austin, TX. When $n$ is even, the desired set is
all even integers; when $n$ is odd, it is all integers.

When $n$ is even, the summands must be odd, and the sum of two odd numbers is even.
In every other case, that is, when $m$ is even or $n$ is odd, we provide a realization of $m$. Let
$p_1, \dots, p_k$ be the primes dividing $n$. For each $i$, let $b_i$ be a number not congruent to 0 or to $m$ modulo $p_i$. By
the Chinese Remainder Theorem, there is a number $h$ such that $h \equiv b_i \pmod{p_i}$ for all $i$.
It follows that both $h$ and $m - h$ are relatively prime to $n$.


Solution II by Nasha Komanda, Central Michigan University, Mt. Pleasant, MI. With $n$
fixed, for any integer $k$ let $q(k)$ be the product of all primes that divide $n$ but not $k$ (take
$q(k) = 1$ if there are no such primes). Observe that $k + q(k)$ and $k - q(k)$ are relatively
prime to $n$. As in Solution I, we must realize $m$ whenever $m$ is even or $n$ is odd. If $m$ and $n$
are odd, we use $(m + q(m))/2$ and $(m - q(m))/2$. If $m$ is even, we use $m/2 + q(m/2)$
and $m/2 - q(m/2)$.
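
The construction in Solution II is explicit enough to run. Here is a minimal sketch of it; the names `prime_divisors`, `q` and `split` are assumptions made for the example, and `q` takes $n$ as an explicit first argument rather than treating it as fixed.

```python
from math import gcd


def prime_divisors(n):
    """Distinct primes dividing the positive integer n."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def q(n, k):
    """Product of the primes that divide n but not k (1 if there are none)."""
    product = 1
    for p in prime_divisors(n):
        if k % p != 0:
            product *= p
    return product


def split(n, m):
    """Return (x, y) with x + y == m and both coprime to n, or None.

    None is returned exactly when n is even and m is odd, the one case
    in which no such pair exists.
    """
    if m % 2 == 0:
        h = m // 2
        return h + q(n, h), h - q(n, h)
    if n % 2 == 1:
        return (m + q(n, m)) // 2, (m - q(n, m)) // 2
    return None


if __name__ == "__main__":
    for n, m in ((6, 4), (15, 7), (30, -8), (12, 5)):
        pair = split(n, m)
        print(n, m, pair, pair and (gcd(pair[0], n), gcd(pair[1], n)))
```

The summands may be negative or exceed $m$, which is why $(m, n) = (4, 6)$ and $(7, 15)$ are handled here although they fail when both summands are required to be positive.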


Editorial comment. Most solvers used the Chinese Remainder Theorem. Sydney Bulman- 
Fleming observed that the answer would be different if m were required to be the sum of two 
positive integers. With this interpretation, there would be infinitely many counterexamples 
(e.g., (m,n) = (4, 6) or (m,n) = (7, 15)) to the result shown here. Frank Schmidt noted 
that a solution can be obtained using problem 49 in W. Sierpinski, 250 Problems in Number 
Theory, Elsevier, U.S. edition 1970, which reads: “Prove that for every positive integer m 
every even integer 2k can be represented as a difference of two positive integers relatively 
prime to m.” 

Solved also by P. J. Anderson (Canada), R. Barbara (Lebanon), P. Budney, S. Bulman-Fleming (Canada), G. Ehrlich,
C. Lanski, J. H. Lindsey II, O. P. Lossers (The Netherlands), A. Pedersen (Denmark), R. M. Robinson, F. Schmidt,
A. Tissier (France), A. N. ’t Woord (The Netherlands), NSA Problems Group, and the proposer.


Disjoint Connections 


10341 [1993, 874]. Proposed by George Cain and Zhiging Lu, Georgia Institute of Technology,
Atlanta, GA. Let $D = \{(x, y) : x^2 + y^2 \le 1\}$ be the unit disk in the plane, and
let $\{A_1, A_2, \dots, A_n\}$ be a pairwise disjoint collection of finite subsets of the set $C =
\{(x, y) : x^2 + y^2 = 1\}$. Prove that there is a pairwise disjoint collection $\{K_1, K_2, \dots, K_n\}$
of connected subsets of $D$ such that $A_i \subset K_i$ for each $i = 1, 2, \dots, n$.


Solution I by Eugene Curtin, Southwest Texas State University, San Marcos, TX. We prove
the result without assuming the sets $A_i$ are finite, or even that there are finitely many such
sets. For each $\theta \in [0, 2\pi)$ let


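\[ B_\theta = \left\{ \left( r\cos\left(\theta + \tfrac{1}{1-r}\right),\ r\sin\left(\theta + \tfrac{1}{1-r}\right) \right) : 0 < r < 1 \right\}. \]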


The sets $\{B_\theta : \theta \in [0, 2\pi)\}$ are pairwise disjoint, and each $B_\theta$ has the entire unit circle
contained in its limit set. If $\{A_v : v \in V\}$ is any family of pairwise disjoint nonempty subsets
of the unit circle, then for each $v \in V$ pick an angle $\theta(v)$ such that $(\cos\theta(v), \sin\theta(v)) \in A_v$.
Then let $K_v = B_{\theta(v)} \cup A_v$. Each $K_v$ is connected since it is the union of a connected set with
a subset of its limit points. Thus, $\{K_v : v \in V\}$ is a family of connected pairwise disjoint
subsets of the disk with $A_v \subset K_v$ for all $v \in V$.


Solution II by Frank Schmidt, Arlington, VA. Given sets $A_i$ as in the original statement, we
sketch a construction of sets $K_i$.

Step 1. Draw polygonal arcs connecting the points in each $A_i$, $1 \le i \le n$.

Step 2. Modify these arcs, if necessary, to achieve general position. That is, no more than
two arcs should cross at any point.

Step 3. If an arc joining points in $A_i$ meets an arc joining points in $A_j$ with $j \ne i$, remove the
crossing as in R. J. MacG. Dawson, “Paradoxical connections”, this MONTHLY 96 (1989),
31-33.


Editorial comment. The most common solution was a spiral construction like Solution I. 
Other solutions used comb spaces or the graph of sin(1/x). Leroy F. Meyers noted the 
connection with MONTHLY Problems E1515 [1962, 312; 1963, 95] and 10328 [1993, 689; 
96, Nov]. 

Solved also by J. W. Grossman, R. Holzsager, S. N. Kass, U. Klein (Germany), L. F. Meyers, A. Müller (Switzerland),
S. Ott, G. Poor & R. Griffus, T. Richmond & B. Richmond, A. Riese, H. Schlais, A. W. Schurle, W. R. Smythe, S. T.
Stefanov (Bulgaria), A. N. ’t Woord (The Netherlands), the New Mexico Tech Problem Solving Group, NSA Problems
Group, The Citadel Problem Solving Group, and the proposers.


Emergence of an Abelian Group 


10342 [1993, 874]. Proposed by Shmuel Rosset, Tel Aviv University, Ramat Aviv, Israel. Let
$F$ be a free group, and let $R$ be a normal subgroup of $F$. Consider the subgroups $[R, nF]$
defined by

\[ [R, nF] = \begin{cases} R & \text{if } n = 0, \\ [[R, (n-1)F], F] & \text{if } n > 0. \end{cases} \]

Prove that the set of elements of finite order in $R/[R, nF]$ is an abelian group.


Solution by Stephen M. Gagola, Jr., Kent State University, Kent, OH. Let $T_n$ denote the
inverse image in $R$ of the set of elements of finite order in $R/[R, nF]$, so $T_{n+1} \subseteq T_n$. The
problem asks for a proof that $T_n/[R, nF]$ is an abelian subgroup of $R/[R, nF]$. We show
that $T_n/[R, (n+1)F]$ is a central subgroup of $R/[R, (n+1)F]$ and hence is abelian. Since
$T_n/[R, nF]$ is a homomorphic image of $T_n/[R, (n+1)F]$, the desired result follows.

Let $\{F_n\}$ denote the terms of the lower central series of $F$, defined recursively by $F_1 = F$
and $F_{n+1} = [F_n, F]$ for $n \ge 1$. The only property of free groups used here is that $F/F_n$
is torsion-free for all $n$. Since $[R, nF] \subseteq [F, nF] = F_{n+1}$ and $F/F_{n+1}$ is torsion-free,
it follows that $T_n \subseteq R \cap F_{n+1}$, so $[R, T_n] \subseteq [R, F_{n+1}]$. To prove that $T_n/[R, (n+1)F]$
is central in $R/[R, (n+1)F]$ (and hence is an abelian group), it suffices to prove that
$[R, F_{n+1}] \subseteq [R, (n+1)F]$.

We prove by induction on $n$ that $[R, F_n] \subseteq [R, nF]$ for every $n$ and every normal
subgroup $R$. Equality holds when $n = 1$. For $n > 1$, we inductively assume that
$[N, F_{n-1}] \subseteq [N, (n-1)F]$ for every normal subgroup $N$. By the Three Subgroups Lemma,

\[ [[F_{n-1}, F], R] \subseteq [[R, F_{n-1}], F] \cdot [[F, R], F_{n-1}]. \]

But

\[ [[R, F_{n-1}], F] \subseteq [[R, (n-1)F], F] = [R, nF] \]

and

\[ [[F, R], F_{n-1}] = [[R, F], F_{n-1}] \subseteq [[R, F], (n-1)F] = [R, nF]. \]

Hence $[R, F_n] = [[F_{n-1}, F], R] \subseteq [R, nF]$.


Solved also by A. M. Gaglione & D. Spellman and the proposer. 


Semi-unfriendly Sets 


10343 [1993, 874]. Proposed by David M. Bloom, Brooklyn College, CUNY, Brooklyn, NY.
Let us call a subset of $\mathbb{Z}$ semi-unfriendly (abbreviated S-U) if it contains no three consecutive
integers. Let $E_n$ denote the $n$-element set $\{1, 2, \dots, n\}$ and let

\[ A(n, k) = \#\{S \subseteq E_n : \#S = k,\ S \text{ is S-U}\}, \]
\[ B(n, k) = \#\{S \subseteq E_n : \#S = k,\ S \text{ is S-U and } E_n - S \text{ is S-U}\}. \]

Prove that $B(3n - 1, n) = A(n + 3, 3)$ for all $n \ge 1$.


Solution I by the late Raphael M. Robinson. $A(n + 3, 3)$ is the number of 3-element subsets of
$E_{n+3}$ that are not composed of three consecutive integers. Since there are $n + 1$ consecutive
triples,

\[ A(n + 3, 3) = \binom{n+3}{3} - (n + 1) = n(n + 1)(n + 5)/6. \]

To compute $B(3n - 1, n)$, consider sequences $0 = x_0 < x_1 < \cdots < x_n < x_{n+1} = 3n$ such
that $x_{i+1} - x_i \le 3$ for $0 \le i \le n$. The condition requires that $E_{3n} - \{x_1, x_2, \dots, x_{n+1}\}$
contains no consecutive triple. Since the sum of the $n + 1$ differences is $3n$, the differences
must all equal 3, except for three differences, each 2, or two differences, one 1 and one
2. Hence $x_1, \dots, x_{n+1}$ also contains no consecutive triple. Choosing the locations for the
differences other than 3 yields

\[ B(3n - 1, n) = \binom{n+1}{3} + (n + 1)n = n(n + 1)(n + 5)/6. \]

Thus $A(n + 3, 3) = B(3n - 1, n)$.


Composite solution II by Richard Holzsager, American University, Washington, DC and the
proposer. Encode a set $T \subseteq E_n$ by a string $\sigma(T) = u_1 u_2 \dots u_n u_{n+1}$, where $u_{n+1} = 0$ and
otherwise $u_i = 1$ if $i \in T$ and $u_i = 0$ if $i \notin T$. Then $T$ is S-U if and only if $\sigma(T)$ can be
decomposed into substrings of the form 0, 10, and 110 (such a decomposition is unique). If
$\sigma(T)$ uses $a$, $b$, $c$ of the three types of substrings, respectively, then $n + 1 = a + 2b + 3c$
and $\#(T) = b + 2c$. If we change each 0 to 110 and each 110 to 0 in the decomposition of
$\sigma(T)$, we obtain $\sigma(T')$ for an S-U set $T'$ of size $2a + b = 2(n + 1) - 3\#(T)$. Furthermore,
the length of $\sigma(T')$ is $3a + 2b + c = 3(n + 1) - 4\#(T)$. This bijection proves that
$A(n, k) = A(3n - 4k + 2, 2n - 3k + 2)$. In particular, $A(n + 3, 3) = A(3n - 1, 2n - 1)$.

By complementation, $B(3n - 1, n) = B(3n - 1, 2n - 1)$. As argued in Solution I,
the complement of every semi-unfriendly $(2n - 1)$-element subset of $E_{3n-1}$ is also
semi-unfriendly, so $B(3n - 1, 2n - 1) = A(3n - 1, 2n - 1)$. Hence $B(3n - 1, n) = A(n + 3, 3)$.


Editorial comment. Charles Lanski and Uday S. Gandbhir gave a more direct bijection from
the set counted by $A(n + 3, 3)$ to the set counted by $B(3n - 1, n)$: If a set is counted by
$A(n + 3, 3)$, let its complement in $E_{n+3}$ be $\{x_1, \dots, x_n\}$, where $x_1 < \cdots < x_n$. Then
$B(3n - 1, n)$ counts the sets $\{4 - x_1, 8 - x_2, \dots, 4n - x_n\}$, and the correspondence is
bijective.


J. C. Binz generalized the result to $m$-unfriendly sets, which contain no consecutive
$m$-tuple. Letting $A_m(n, k)$ and $B_m(n, k)$ count the $m$-unfriendly $k$-subsets of $E_n$ and the
$m$-unfriendly $k$-subsets of $E_n$ whose complements are also $m$-unfriendly, he proved that
$B_m(mn - 1, n) = \binom{m+n}{m} - (n + 1) = A_m(n + m, m)$.
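
For small $n$ all of these counts can be confirmed by exhaustive enumeration. The following brute-force sketch, with illustrative function names, counts $A(n, k)$ and $B(n, k)$ directly and applies the map $x_i \mapsto 4i - x_i$ of the direct bijection to the complements of the sets counted by $A(n + 3, 3)$.

```python
from itertools import combinations


def is_su(s):
    """True if the set s contains no three consecutive integers."""
    return not any(x + 1 in s and x + 2 in s for x in s)


def su_subsets(n, k, both=False):
    """S-U k-subsets of E_n; with both=True the complement must be S-U as well."""
    universe = set(range(1, n + 1))
    return [frozenset(c) for c in combinations(range(1, n + 1), k)
            if is_su(set(c)) and (not both or is_su(universe - set(c)))]


def A(n, k):
    return len(su_subsets(n, k))


def B(n, k):
    return len(su_subsets(n, k, both=True))


def direct_image(n):
    """Images of the sets counted by A(n+3, 3) under x_i -> 4i - x_i on complements."""
    images = set()
    for s in su_subsets(n + 3, 3):
        xs = sorted(set(range(1, n + 4)) - s)
        images.add(frozenset(4 * i - x for i, x in enumerate(xs, start=1)))
    return images


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, A(n + 3, 3), B(3 * n - 1, n), n * (n + 1) * (n + 5) // 6)
```

Enumeration is exponential in $n$, so this is only a check on small cases, not a substitute for either counting argument.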


Solved also by R. Barbara (Lebanon), D. Beckwith, J. Boutillon (France), S. Byrd, D. Callan, U. S. Gandbhir (Switzer- 
land), D. S. Gunderson, R. D. Hurwitz, K. S. Kedlaya, N. Komanda, C. Lanski, G. Lord, O. P. Lossers (The Netherlands), 
A. Pedersen (Denmark), D. Wolfe, A. N. ’t Woord (The Netherlands), Anchorage Math Solutions Group, and the NSA 
Problems Group (two solutions). 


Cubic Polynomials from Curious Sums 


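10346 [1993, 951]. Proposed by David Doster, Choate Rosemary Hall, Wallingford, CT.
Prove that, for all primes $p$,

\[ \sum_{k=1}^{p-1} \left\lfloor \frac{k^3}{p} \right\rfloor = \frac{(p - 2)(p - 1)(p + 1)}{4} \qquad (A) \]

and

\[ \sum_{k=1}^{M} \left\lfloor \sqrt[3]{kp} \right\rfloor = \frac{(p - 2)(p - 1)(3p - 5)}{4}, \qquad (B) \]

where $M = (p - 1)(p - 2)$.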


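Solution to (A) by Manjul Bhargava (student), Harvard University, Cambridge, MA. For
$1 \le k \le p - 1$, we have $k^3 \not\equiv 0 \pmod{p}$ and $(p - k)^3 \equiv -(k^3) \pmod{p}$, and therefore

\[ \left( \frac{k^3}{p} - \left\lfloor \frac{k^3}{p} \right\rfloor \right) + \left( \frac{(p - k)^3}{p} - \left\lfloor \frac{(p - k)^3}{p} \right\rfloor \right) = 1. \]

Hence,

\[ \sum_{k=1}^{p-1} \left\lfloor \frac{k^3}{p} \right\rfloor = \frac{1}{p} \sum_{k=1}^{p-1} k^3 - \frac{p - 1}{2} = \frac{(p - 2)(p - 1)(p + 1)}{4}. \]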


Solution to (B) by Ed Shapiro, Hanover, NH, and Lou Shapiro, Howard University, Washington,
DC. Consider the set of lattice points $S = \{(n, m) : 1 \le n \le M,\ 1 \le m \le p - 1\}$.
Since $p$ does not divide $m$, we have $m^3 \ne np$ for all $(n, m) \in S$, and hence the curve
$y = \sqrt[3]{xp}$ contains no point of $S$. Now the sum $\sum_{k=1}^{M} \lfloor \sqrt[3]{kp} \rfloor$ counts the points of $S$
below the curve, while $\sum_{k=1}^{p-1} \lfloor k^3/p \rfloor$ counts the points of $S$ to the left of the curve. Since
$\#(S) = (p - 2)(p - 1)^2$, we use (A) to obtain

\[ \sum_{k=1}^{M} \left\lfloor \sqrt[3]{kp} \right\rfloor = (p - 2)(p - 1)^2 - \frac{(p - 2)(p - 1)(p + 1)}{4} = \frac{(p - 2)(p - 1)(3p - 5)}{4}. \]
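
Both formulas are straightforward to test in integer arithmetic. The sketch below is illustrative, with assumed function names; it uses an exact integer cube root so that no floating-point rounding affects the floor in (B).

```python
def icbrt(n):
    """Floor of the cube root of the non-negative integer n, computed exactly."""
    x = round(n ** (1.0 / 3.0))   # floating-point guess, corrected below
    while x**3 > n:
        x -= 1
    while (x + 1) ** 3 <= n:
        x += 1
    return x


def sum_A(p):
    """Left side of (A): sum of floor(k^3 / p) for k = 1..p-1."""
    return sum(k**3 // p for k in range(1, p))


def sum_B(p):
    """Left side of (B): sum of floor(cbrt(k p)) for k = 1..(p-1)(p-2)."""
    M = (p - 1) * (p - 2)
    return sum(icbrt(k * p) for k in range(1, M + 1))


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 6, 10, 15):
        print(p,
              4 * sum_A(p) == (p - 2) * (p - 1) * (p + 1),
              4 * sum_B(p) == (p - 2) * (p - 1) * (3 * p - 5))
```

The list includes the square-free composites 6, 10 and 15 as well as primes; for a modulus such as 4, which is not square-free, the left side of (A) exceeds the right side.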


Editorial comment. The selected solutions prove the result, observed by many solvers, that
the formulas hold whenever $p$ is square-free. More generally, for any integer $p$, let $a$ be the
number of positive integers less than $p$ whose cube is divisible by $p$. Some readers showed
that (A) holds when $p$ is any integer if the right side is augmented by $a/2$. Since this is
half the number of lattice points on the curve $y = \sqrt[3]{xp}$ appearing in the selected solution
to (B), the right side of (B) must also be increased by this amount. Other readers retained
the assumption that $p$ be prime and investigated the effect of replacing the exponent 3 by
an arbitrary odd integer $r$. These generalizations were combined by K. S. Williams, who
gave formulas involving Bernoulli numbers for the general sum of this type.

Solved also by A. Adelberg, J. Alvarez (Spain), J. Anglesio (France), R. Bagby, R. Barbara (Lebanon), A. Bergman, 
K. L. Bernstein, J. C. Binz (Switzerland), R. J. Chapman (U. K.), W. Chu (China), D. A. Darling, C. A. DeCarlucci, P. L. 
Douillet (France), J. S. Frame, S. M. Gagola Jr., U. S. Gandbhir (Switzerland), M. Getz, J. Greene, R. Holzsager, K. S. 
Kedlaya, M. J. Knight, H. K. Krishnapriyan, K.-W. Lau (Hong Kong), O. P. Lossers (The Netherlands), D. E. Manes, 
A. Nijenhuis, A. Pedersen (Denmark), R. M. Robinson, A. J. Rosenthal & D. E. Arias, K. Schilling, F Schmidt, C. Schoen, 
M. Vowe (Switzerland), H. Widmer (Switzerland), K. S. Williams (Canada), A. N. ’t Woord (The Netherlands), NSA 
Problems Group, and the proposer. 


A 4-Angle Criterion for Concurrence 


10348 [1993, 952]. Proposed by Jiang Huanxin (student), FuDan University, ShangHai,
China. Let $D$, $E$, $F$ be distinct points on the sides $BC$, $CA$, and $AB$ respectively of $\triangle ABC$.
Let $\alpha = \angle BDF$, $\beta = \angle FDA$, $\gamma = \angle ADE$, and $\delta = \angle EDC$. If $AD$, $BE$, and $CF$ are
concurrent and $\alpha/\beta = \delta/\gamma = m$ ($m \ne 1$), prove that $\alpha = \delta$ and $\beta = \gamma$.


Solution by Nasha Komanda, Central Michigan University, Mt. Pleasant, MI. Use the
proportions

\[ \frac{\mathrm{Area}(\triangle BDF)}{\mathrm{Area}(\triangle ADF)} = \frac{|BF|}{|AF|} = \frac{|BD| \sin\alpha}{|AD| \sin\beta} \]

and

\[ \frac{\mathrm{Area}(\triangle CDE)}{\mathrm{Area}(\triangle ADE)} = \frac{|CE|}{|AE|} = \frac{|CD| \sin\delta}{|AD| \sin\gamma} \]

to obtain

\[ \frac{|BF| \cdot |AE|}{|AF| \cdot |CE|} = \frac{|BD| \sin\alpha \sin\gamma}{|CD| \sin\beta \sin\delta}. \]

By Ceva's Theorem, $|BF| \cdot |AE| \cdot |CD| = |AF| \cdot |CE| \cdot |BD|$. Therefore,

\[ \frac{\sin\alpha}{\sin\beta} = \frac{\sin\delta}{\sin\gamma}. \qquad (*) \]

If in addition $\alpha/\beta = \delta/\gamma = m$ ($m \ne 1$), then $f(\beta) = f(\gamma)$ where $f(x) = \sin(mx)/\sin x$.
The equality $\beta = \gamma$ (which implies $\alpha = \delta$) follows if we prove that $f$ is strictly monotone
on $(0, a)$ where $a = \pi$ if $m < 1$ and $a = \pi/m$ if $m > 1$.
Since $f'(x) = g(x)/\sin^2 x$ with $g(x) = m\cos(mx)\sin x - \cos x \sin(mx)$, it suffices to
show that $g(x)$ does not change sign on $(0, a)$. However, $g'(x) = (1 - m^2)\sin(mx)\sin x$
and $g(0) = 0$. Thus, $g(x) > 0$ on $(0, a)$ if $m < 1$ and $g(x) < 0$ on $(0, a)$ if $m > 1$.
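
The calculus step can be spot-checked numerically. The sketch below is only an illustration under assumed sample values of $m$: it samples $f$ on the interval $(0, a)$ to test monotonicity and compares a central difference of $g$ with the closed form for $g'$.

```python
import math


def f(x, m):
    """f(x) = sin(mx) / sin(x) from the solution."""
    return math.sin(m * x) / math.sin(x)


def g(x, m):
    """Numerator of f'(x): m cos(mx) sin(x) - cos(x) sin(mx)."""
    return m * math.cos(m * x) * math.sin(x) - math.cos(x) * math.sin(m * x)


def g_prime_formula(x, m):
    """Closed form stated for g'(x)."""
    return (1 - m * m) * math.sin(m * x) * math.sin(x)


def strictly_monotone(m, samples=2000):
    """Sample f on (0, a), a = pi for m < 1 and pi/m otherwise, and test monotonicity."""
    a = math.pi if m < 1 else math.pi / m
    vals = [f(a * i / (samples + 1), m) for i in range(1, samples + 1)]
    steps = [v2 - v1 for v1, v2 in zip(vals, vals[1:])]
    return all(s > 0 for s in steps) or all(s < 0 for s in steps)


if __name__ == "__main__":
    for m in (0.2, 0.5, 0.9, 1.5, 2.0, 3.7):
        print(m, strictly_monotone(m))
```

A sampled check of this kind cannot replace the sign argument for $g$, but it does catch algebra slips in the derivative.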


Editorial comment. The first part of the proof shows that formula (*) is equivalent to $AD$,
$BE$, and $CF$ being concurrent when the “sides” of the triangle are taken to be line segments
rather than extended lines. Other applications of a general concurrence criterion based on
(*) would be interesting.

The interval used in the second part of the proof is long enough to cover all intended
geometric interpretations of the angles $\alpha$, $\beta$, $\gamma$, and $\delta$.

For $m \ne 1$, the conditions of the problem force the segment $AD$ to be an altitude of
$\triangle ABC$. Conversely, if $AD$ is an altitude, (*) reduces to $\tan\alpha = \tan\delta$, which is equivalent
to $\alpha = \delta$. On the other hand, if $AD$ is not an altitude, the general example of concurrent
lines does not have $\alpha/\beta = \delta/\gamma$; but, for any $D$, examples with $m = 1$ are easily constructed.
Since (*) is then satisfied, this gives an example in which $AD$, $BE$ and $CF$ are concurrent.


Solved also by R. Barbara (Lebanon), R. J. Chapman (U. K.), A. Coffman, H. W. Guggenheimer, K. S. Kedlaya, Y.-H.
Kiem (Korea), O. P. Lossers (The Netherlands), A. Nijenhuis, M. Vowe (Switzerland), and the proposer.


A Lemma of Dickson


10350 [1993, 952]. Proposed by Borislav Lazarov, Sofia, Bulgaria. Let $M$ be a set of
positive integers. Let $P_M$ be the set of all primes that divide elements of $M$, and let $L_M$ be
the set of elements of $M$ having no proper divisor in $M$. Show that $P_M$ finite implies $L_M$
finite.


Solution by the late Raphael M. Robinson. Let $P_M = \{p_1, p_2, \dots, p_n\}$; use induction on
$n$. If $n = 0$, then $M$ is $\emptyset$ or $\{1\}$, and $L_M = M$. If $n = 1$, then $|L_M| = 1$. Suppose
$n > 1$, and assume the result for $|P_M| < n$. Choose a fixed element $p_1^{r_1} p_2^{r_2} \cdots p_n^{r_n}$ of $L_M$.
If $p_1^{s_1} p_2^{s_2} \cdots p_n^{s_n}$ is another element of $L_M$, then $s_k < r_k$ for some $k$. We need show only
that the set of elements of $L_M$ satisfying any one of these inequalities is finite. There are
$r_k$ choices for $s_k$, so it is sufficient to show that the set of elements of $L_M$ with $s_k$ fixed is
finite. Let $M'$ be the set of numbers $x$ prime to $p_k$ such that $p_k^{s_k} x \in M$. Then $|P_{M'}| < n$,
so $L_{M'}$ is finite. Hence the set of elements of $L_M$ with the prescribed $s_k$ is also finite, since
it is a subset of the set $p_k^{s_k} L_{M'}$ obtained by multiplying elements of $L_{M'}$ by $p_k^{s_k}$.


Editorial comment. Paul Erdős noted that the assertion of this problem appeared as Lemma B
(an immediate corollary of Lemma A) in L. E. Dickson, Finiteness of the odd perfect and 
primitive abundant numbers with n distinct prime factors, Amer. J. Math. 35 (1913), 413- 
422 (or Collected Mathematical Papers of Leonard Eugene Dickson, Chelsea, 1975, Vol. 1, 
349-358). Dickson gave two proofs of his Lemma A, one by induction along the lines of 
the selected solution and one by using the Hilbert Basis Theorem. 

A generalization of Dickson’s Lemma appeared as MONTHLY problem 4358 [1949, 480;
1952, 255]. While Problem 4358 contains the assertion of the present problem, its solution
is somewhat more complicated.

Solved also by A. Adelberg, R. Barbara (Lebanon), D. Beckwith, K. L. Bernstein, M. Bhargava, P. Budney, D. Caccia,
R. J. Chapman (U. K.), M. Dawes (Canada), P. Erdős (Hungary), K. Fabian & V. Sandor (Germany), S. M. Gagola Jr.,
R. Holzsager, K. S. Kedlaya, H. K. Krishnapriyan, O. P. Lossers (The Netherlands), R. MacDonald, A. Nijenhuis,
V. Pambuccian, A. Pedersen (Denmark), A. Riese, K. Schilling, F. Schmidt, J. Simpson (Australia), M. Woltermann,
A. N. ’t Woord (The Netherlands), X. Xarles (Spain), NSA Problems Group, and the proposer.


The Ratio of Volume to Surface Area 


10352 [1993, 952]. Proposed by Yves Nievergelt, Eastern Washington University, Cheney,
WA. Let $U$ be an open subset of $\mathbb{R}^n$ with smooth boundary $\partial U$ contained in a ball of
radius $R$. (a) For $n = 3$, show that $\mathrm{Vol}(U) \le R \cdot \mathrm{Area}(\partial U)/3$. (b) Generalize to arbitrary
dimensions $n$.


Solution I by Richard Holzsager, The American University, Washington, DC. By the divergence
theorem, if $F$ is a vector field, then the integral over $\partial U$ of the dot product of $F$ with
the outward normal $\nu$ is equal to the integral of $\nabla \cdot F$ over $U$. Placing the origin at the center
of the given ball and taking $F$ to be the radial vector field $F(p) = p$, we get $|F \cdot \nu| \le R$
and $\nabla \cdot F = 3$, giving the result.

The divergence theorem generalizes to any number of dimensions. Repeating this argument
in $n$ dimensions, we get $n \mathrm{Vol}(U) \le R\, \mathrm{Area}(\partial U)$ since $\nabla \cdot F = n$.


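Solution II by Paul Sisson, Louisiana State University, Shreveport, LA. Let $B_r^n$ be the
ball of radius $r$ and $S_r^{n-1}$ the sphere of radius $r$ in $\mathbb{R}^n$. The isoperimetric inequality is
$\mathrm{Vol}(U)^{(n-1)/n} \le C_n \mathrm{Area}(\partial U)$ where $C_n = \mathrm{Vol}(B_r^n)^{(n-1)/n}/\mathrm{Area}(S_r^{n-1})$. Thus

\[ \mathrm{Vol}(U) \le C_n \mathrm{Vol}(U)^{1/n} \mathrm{Area}(\partial U) \le C_n \mathrm{Vol}(B_R^n)^{1/n} \mathrm{Area}(\partial U) = \frac{\mathrm{Vol}(B_R^n)^{(n-1)/n}\, \mathrm{Vol}(B_R^n)^{1/n}}{\mathrm{Area}(S_R^{n-1})} \mathrm{Area}(\partial U). \]

Since $\mathrm{Vol}(B_R^n) = (R/n) \mathrm{Area}(S_R^{n-1})$, we have $\mathrm{Vol}(U) \le (R/n) \mathrm{Area}(\partial U)$ in $\mathbb{R}^n$.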


Editorial comment. Both Richard Holzsager and Erik I. Verriest noted that the sphere in 
solution II need not contain U. It suffices to assume only that its volume is greater than or 
equal to Vol(U). 

This problem is part of problem 6 of section 14 in H. Guggenheimer, Applicable Geometry, 
Krieger, 1977. 

Solved also by R. J. Chapman (U. K.), H. W. Guggenheimer, M. S. Klamkin (Canada), T. C. Lim, O. P. Lossers (The 
Netherlands), G. Marton (Hungary), T. A. Murdoch, A. Nijenhuis, F. Schmidt, P. Szeptycki, E. I. Verriest (France), and 
the proposer. 


Natural Linear Combinations 


10354 [1994, 75]. Proposed by Hassan Ali Shah Ali, Tehran, Iran. Determine the least 
natural number $N$ such that, for all $n \ge N$, there exist natural numbers $a, b$ with 
$n = \lfloor a\sqrt{2} + b\sqrt{3} \rfloor$. 


Solution I by O. P. Lossers, University of Technology, Eindhoven, The Netherlands. There 
are two cases depending on whether 0 is considered to be a natural number. We first treat 
the case in which 0 is allowed. For every $k$, the numbers $\lfloor a\sqrt{2} + b\sqrt{3} \rfloor$ with $a + b = k$ (and 
$a, b \ge 0$) represent all integers in the range from $\lfloor k\sqrt{2} \rfloor$ to $\lfloor k\sqrt{3} \rfloor$, since $\sqrt{3} - \sqrt{2} < 1$. 
For $k = 0, 1, 2$, this gives the intervals $[0, 0]$, $[1, 1]$, $[2, 3]$. Also for $k \ge 2$, we have 
$(k + 1)\sqrt{2} - k\sqrt{3} < 1$. Hence every natural number is covered and $N = 0$ in this case. 

If 0 is not considered a natural number, a similar analysis shows that N = 3. 


Solution II by Albert Nijenhuis, University of Pennsylvania (Emeritus), Philadelphia, PA, 
and University of Washington, Seattle, WA. Assume that 0 is not a natural number. Then 
the minimal number representable in the form $\lfloor a\sqrt{2} + b\sqrt{3} \rfloor$ with natural numbers $a, b$ is 
$\lfloor \sqrt{2} + \sqrt{3} \rfloor = 3$. Thus $N \ge 3$. 

Let $n > 3$ and set $m = \lfloor (n - \sqrt{3})/\sqrt{2} \rfloor$. Then 

\[
\sqrt{3} + m\sqrt{2} < n < \sqrt{3} + (m + 1)\sqrt{2},
\]

where equality cannot occur because of irrationality. If $\sqrt{3} + (m + 1)\sqrt{2} < n + 1$, then 
$a = m + 1$ and $b = 1$ gives a representation of $n$. Otherwise, $\sqrt{3} + (m + 1)\sqrt{2} > n + 1$, 
so that $n - (\sqrt{3} + m\sqrt{2}) < \sqrt{2} - 1$. Since $\sqrt{2} - 1 < 2(\sqrt{3} - \sqrt{2}) < 1$, it follows that 

\[
\sqrt{3} + m\sqrt{2} < n < \sqrt{3} + m\sqrt{2} + 2(\sqrt{3} - \sqrt{2}) = 3\sqrt{3} + (m - 2)\sqrt{2} < n + 1,
\]

so $n$ is representable with $a = m - 2$ and $b = 3$, provided that $m > 2$. Thus all $n > 3\sqrt{2} + \sqrt{3}$ 
are representable, giving $N \le 6$. In addition $4 = \lfloor 2\sqrt{2} + \sqrt{3} \rfloor$ and $5 = \lfloor 3\sqrt{2} + \sqrt{3} \rfloor$, so 
$N = 3$. Note that these representations require only $b = 1$ or $b = 3$. 
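
Both answers can be checked numerically over a finite range. The following is a minimal brute-force sketch, not part of either published solution; the function names and the search limit of 200 are illustrative assumptions, and floating-point square roots are assumed accurate enough at this scale.

```python
import math

def representable(limit, allow_zero):
    """Return the set of n <= limit of the form floor(a*sqrt(2) + b*sqrt(3))."""
    start = 0 if allow_zero else 1
    found = set()
    # Generous upper bounds: larger a or b would overshoot limit + 1.
    max_a = int((limit + 1) / math.sqrt(2)) + 1
    max_b = int((limit + 1) / math.sqrt(3)) + 1
    for a in range(start, max_a + 1):
        for b in range(start, max_b + 1):
            n = math.floor(a * math.sqrt(2) + b * math.sqrt(3))
            if n <= limit:
                found.add(n)
    return found

def least_threshold(limit, allow_zero):
    """Smallest N such that every n with N <= n <= limit is representable."""
    found = representable(limit, allow_zero)
    n = limit
    while n >= 0 and n in found:
        n -= 1
    return n + 1

if __name__ == "__main__":
    print(least_threshold(200, allow_zero=True))
    print(least_threshold(200, allow_zero=False))
```

A finite search of this kind only confirms the pattern up to the chosen limit; the proofs above are what establish it for all $n$.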


Editorial comment. Most readers noted that this question has two answers depending on 
whether 0 is accepted as a natural number. Frank Schmidt noted that, for any natural number 
k, all sufficiently large integers have at least k representations in this form. Indeed, the 
number of representations of $n$ is asymptotic to $n/\sqrt{6}$. This follows from results in G. H. 
Hardy & J. E. Littlewood, Some problems of Diophantine approximation, Proc. London 
Math. Soc., (2) 20 (1922), 15-36 (or G. H. Hardy, Collected Papers, Vol. 1, Cambridge, 
1966, 136-158). Patrick Dale McCray used a method similar to Solution II and found that 
the only properties used were that $\sqrt{2}$ is irrational and $0 < \sqrt{3} - \sqrt{2} < 1$. 


Solved also by J. Alvarez (Spain), J. Anglesio (France), R. Barbara (Lebanon), K. L. Bernstein, R. E. Bernstein, S. Byrd, 
R. J. Chapman (U. K.), J. Christopher, D. A. Darling, Z. Franco, H. Gintis, R. Holzsager, K.-W. Lau (Hong Kong), 
J. H. Lindsey II, P. D. McCray, R. M. Robinson, F. Schmidt, M. Shemesh (Israel), P. Sisson, R. de la Vega (Colombia), 
M. Vowe (Switzerland), A. N. ’t Woord (The Netherlands), The lonasphere, NSA Problems Group, Westmont College 
Problem Solving Group, and the proposer. 
REVIEWS 


Edited by Underwood Dudley 
Mathematics Department, DePauw University, Greencastle, IN 46135 


Mathematics and Politics: Strategy, Voting, Power and Proof. By Alan D. Taylor, 
Springer-Verlag, 1995 


Reviewed by Samuel Merrill, II 


Mathematics has long been central in the pursuit of the physical sciences and of 
major significance in such fields as biology and economics. Yet its impact on 
political science (the early work of the Marquis de Condorcet in the 18th century, 
Duncan Black in the 1940s, and others notwithstanding) has been slow in coming. 
In the past three decades, however, a mathematical way of thinking has come to 
play a respected role in the study of politics, and in many ways represents a cutting 
edge of the field. At the same time the mathematics of politics has entered 
elementary mathematics textbooks, as mathematicians see politics as a source of 
fascinating problems and paradoxes. 

Two of the important strands of this effort are (1) how to translate the 
preferences of individual members of an electorate over several alternatives into a 
coherent social preference—the theory of social choice—and (2) how to model 
conflict between two or more entities and determine appropriate strategies for the 
antagonists—game theory. Much of the raison d’étre of social choice stems from 
the simple paradox of voting. Suppose three voters have transitive preference 
rankings abc, bca, and cab, respectively, for three alternatives labeled a, b, and c. 
Then a is preferred by a majority of two to one over b, b is similarly preferred 
over c, and in turn c over a, so that the principle of majority preference does not 
yield a transitive social ordering. 
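
The paradox is easy to confirm by direct enumeration. The sketch below is an illustration only; the helper names `majority_prefers` and `transitive_social_orders` are hypothetical and do not come from the book under review.

```python
from itertools import permutations

def majority_prefers(rankings, x, y):
    """True if a strict majority of voters rank x above y."""
    votes_for_x = sum(1 for r in rankings if r.index(x) < r.index(y))
    return votes_for_x > len(rankings) / 2

def transitive_social_orders(rankings, alternatives):
    """List every ordering of the alternatives that agrees with all pairwise majorities."""
    orders = []
    for order in permutations(alternatives):
        consistent = all(
            majority_prefers(rankings, order[i], order[j])
            for i in range(len(order))
            for j in range(i + 1, len(order))
        )
        if consistent:
            orders.append("".join(order))
    return orders

if __name__ == "__main__":
    rankings = ["abc", "bca", "cab"]  # the three voters from the text
    for x, y in [("a", "b"), ("b", "c"), ("c", "a")]:
        print(x, "beats", y, ":", majority_prefers(rankings, x, y))
    print(transitive_social_orders(rankings, "abc"))
```

For the three rankings in the text every pairwise contest is won two to one, yet no ordering is consistent with all three results, so the list of transitive social orders is empty.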

Duncan Black [1] provided a sufficient condition to avoid this paradox, called 
single-peakedness of preferences, i.e., the existence of an ordering of the alternatives 
such that a plot of any voter’s preferences has a single peak. This idea, along 
with the work of Hotelling and Downs, led to the additional structure of a spatial 
model of electoral competition in which each voter (and candidate) is assumed to 
have an ideal point in a finite-dimensional issue space. The voter’s utility for 
candidates is a declining function of distance from that ideal point (see, e.g., 
Enelow and Hinich [4]). If this model is one-dimensional (e.g., represents the 
familiar liberal/conservative scale), preferences are single-peaked and a transitive 
social order is defined. But with two or more dimensions (representing two or 
more issues), this coherence falls apart again. For example, in a two-dimensional 
model, place three candidates at the three cube roots of unity: $c_k = e^{2\pi i k/3}$, $k = 1, 2, 3$, 
and three voters at the same positions slightly rotated, say: $v_k = e^{i\pi/6} c_k$. Then $c_1$ is 
favored over $c_2$, $c_2$ over $c_3$, and $c_3$ over $c_1$, each by two voters to one. 
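
The two-dimensional cycle can be checked with complex arithmetic. The sketch below assumes that each voter prefers the candidate closer to his or her own position in the plane; the rotation angle `theta` is treated as a free parameter, and the value pi/6 in the usage line is an illustrative choice.

```python
import cmath

def spatial_cycle(theta):
    """Pairwise vote counts for candidates at the cube roots of unity and voters rotated by theta.

    The result maps (i, j) to the number of voters who are closer to candidate i than to candidate j.
    """
    candidates = {k: cmath.exp(2j * cmath.pi * k / 3) for k in (1, 2, 3)}
    voters = [cmath.exp(1j * theta) * c for c in candidates.values()]

    def votes(i, j):
        return sum(1 for v in voters if abs(v - candidates[i]) < abs(v - candidates[j]))

    return {(i, j): votes(i, j) for (i, j) in [(1, 2), (2, 3), (3, 1)]}

if __name__ == "__main__":
    # Each pairing should be won two votes to one, closing the cycle.
    print(spatial_cycle(cmath.pi / 6))
```

Any small positive rotation produces the same cycle; a rotation in the opposite direction reverses it.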

The problems of transitivity of a social ordering parallel those of finding 
undominated candidate strategies, and in particular, equilibria. For a two-candi- 
date contest in a one-dimensional spatial model, the location of the median voter 
is an undominated strategy for a candidate to assume. In fact, it constitutes an 
equilibrium for both candidates, because any unilateral deviation from this stance 
risks losing votes. But in two or more dimensions, only a point that is the 
intersection of all median hyperplanes would constitute such an equilibrium; the 
existence of such a point is highly unlikely. Hence we can expect instability if 
candidates compete on more than one issue, or indeed as long as any candidate 
running behind can introduce and exploit a new issue. For three candidates there’s 
no equilibrium at all even for one dimension; for four candidates in one dimen- 
sion, there are two pairs of equilibrium positions if the voters are uniformly 
distributed; for more candidates, the situation gets progressively more complex 
(see Cox [3]). 

Game theory—as a model of conflict—has played a significant role in eco- 
nomics and political science since the work of von Neumann and Morgenstern [10]. 
One of the chief rewards of applying the theory is to tease out rational behavior 
from seemingly irrational outcomes. For example, two-person, zero-sum games 
always have a solution, but in general that solution requires each player to choose 
probabilistically among two or more strategies. Thus, in repeated play, a player— 
e.g., a political candidate—may appear indecisive while following a rational 
strategy. Many conflicts—although often modeled as zero-sum games—may better 
be represented as variable-sum games, in which interests are not diametrically 
opposed, but rather all players can do better (or worse) simultaneously. 
A basic solution concept is a Nash equilibrium—a pair of strategies by the two 
players from which neither can benefit by a unilateral deviation. The choice of the 
median voter’s position described above is such a solution. 

Real conflicts, however, whether between politicians or polities, 
usually involve a sequence of moves and counter-moves. This characteristic has led 
to the theory of moves (Brams [2]), under which players are assumed to alternate 
with one another in choosing strategies (move or stay put) until neither changes 
strategies. A player is assumed to move only if he or she expects the final outcome 
would be better if both players play rationally. These assumptions often lead to 
strategies that appear more reasonable than those predicted by classic game 
theory. 

Alan Taylor’s objective in Mathematics and Politics: Strategy, Voting, Power and 
Proof is to make available some serious mathematical ideas and techniques arising 
from problems in political science and conflict resolution to a student audience 
with little or no mathematical background. His is a teaching book: he is ever aware 
of what his readers do not know but can be taught when needed in order to 
develop the main ideas. The book is full of side comments to shepherd those 
students who may have misleading thoughts back into the fold. 

This effective style is illustrated well in the chapters on voting power. Attempts 
to give disparate influence to voting units through weighted voting or other means 
are shown to apportion power in unexpected ways. In fact, power itself can be 
defined in several different ways. Cognizant of the beginning student, Taylor 
describes how a yes-no voting system can be specified by winning and losing 
subsets of voters called coalitions and in great detail explains the construction of 
the Shapley-Shubik and Banzhaf indices of voting power. But he also introduces— 
without recourse to unwieldy notation—a series of paradoxes: the new-member 
and donation paradoxes—under both of which the Banzhaf index suffers 
(Felsenthal and Machover [5])—and the bicameral paradox, which infects the 
Shapley-Shubik index (Felsenthal, Machover, and Zwicker [6]). The recent discov- 
ery of these paradoxes lets the student know that research in this area—although 
accessible to the novice—is alive and active and that, furthermore, the jury is still 
out concerning the evaluation of indices. In fact, many of the more significant 
theorems portrayed in the book have been discovered only in the last ten years, 
often by the author himself in collaboration with his colleague, William Zwicker at 
Union College. 
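
To see how weighted voting can apportion power in unexpected ways, the two indices can be computed by brute force for a small system. The following is a minimal sketch under the usual textbook definitions (a voter is pivotal in an ordering if his or her arrival first makes the coalition winning, and critical in a winning coalition if his or her departure makes it losing). The weights and quota are hypothetical, and the function names are not taken from the book.

```python
from fractions import Fraction
from itertools import combinations, permutations

def is_winning(coalition, weights, quota):
    """A coalition wins if its total weight meets the quota."""
    return sum(weights[v] for v in coalition) >= quota

def shapley_shubik(weights, quota):
    """Fraction of voter orderings in which each voter is pivotal."""
    voters = list(weights)
    counts = {v: 0 for v in voters}
    orderings = list(permutations(voters))
    for order in orderings:
        total = 0
        for v in order:
            total += weights[v]
            if total >= quota:
                counts[v] += 1  # v turns the growing coalition into a winning one
                break
    return {v: Fraction(counts[v], len(orderings)) for v in voters}

def banzhaf(weights, quota):
    """Each voter's share of all critical defections from winning coalitions."""
    voters = list(weights)
    critical = {v: 0 for v in voters}
    for size in range(1, len(voters) + 1):
        for coalition in combinations(voters, size):
            if is_winning(coalition, weights, quota):
                for v in coalition:
                    rest = [u for u in coalition if u != v]
                    if not is_winning(rest, weights, quota):
                        critical[v] += 1
    total = sum(critical.values())
    return {v: Fraction(critical[v], total) for v in voters}

if __name__ == "__main__":
    weights = {"A": 50, "B": 49, "C": 1}  # hypothetical weights, quota 51
    print(shapley_shubik(weights, 51))
    print(banzhaf(weights, 51))
```

In this hypothetical system voter B holds 49 votes and voter C only one, yet both indices assign them equal power, since each can form a winning coalition only together with A.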

For example, a yes-no voting system is weighted if and only if it is trade robust, 
i.e., if an arbitrary exchange of players among several winning coalitions leaves at 
least one of the coalitions winning. A less restrictive condition is swap-robustness, 
which applies only to one-for-one exchanges. The U.S. federal system (Congress 
and the President) is not even swap robust (a House member and a Senator cannot 
in general be swapped); the amendment procedure for the Canadian Constitution 
—though swap robust—is not trade robust. Hence neither can be specified as a 
simple weighted system (with weights for the players and a quota). But any yes-no 
voting system can be specified via a finite collection of weighted voting systems 
each with the same set of voters such that the common winning coalitions 
constitute the winning coalitions of the original system. The minimum number of 
such systems needed defines the dimension of the yes-no voting system. The U.S. 
federal system and the Canadian Constitution each have dimension two. Alternatively, 
we may express each yes-no voting system as a Cartesian product of n 
weighted voting systems, where n is, of course, the dimension. 

The same concern for the beginning student pervades the chapter on social 
choice (cf. Straffin’s Theory of Voting [8], and Merrill’s Making Multicandidate 
Elections More Democratic [7]). Attempts to extend the majoritarian principle to 
multicandidate elections lead to one paradox after another and eventually to 
Arrow’s impossibility theorem. Taylor gives a series of formal proofs of properties 
such as the Condorcet winner criterion and monotonicity for various voting 
systems, each followed by a brief synopsis of the proof, which makes memorable 
the crux of the argument. Although a minor point in a generally excellent book, 
this makes the formal proofs seem unnecessary—almost pedantic. Should they be 
included? The chapter ends with the proof of a new result: no social choice system 
for three or more alternatives can satisfy both independence of irrelevant alterna- 
tives and the Condorcet winner criterion. 

This thread is pursued later in the book—after the student has gained greater 
sophistication. Taylor proves May’s theorem: for an odd number n of voters, majority 
rule is the only procedure for two alternatives that is anonymous (invariant under 
permutation of voters), neutral (invariant under permutation of candidates), and 
monotone, and that produces a single winner. If ties are allowed, a quota 
system is obtained where n/2 < q < n + 1. He then proceeds to the classic 
impossibility theorem of Arrow that the only social welfare function for three or 
more candidates satisfying the Pareto condition, independence of irrelevant alter- 
natives, and monotonicity is a dictatorship. To wrap up, he proves theorems of 
Black and of Sen, each of which provides sufficient conditions on the coherence of 
preferences to guarantee a transitive social ordering. 

On the topic of game theory, Taylor shows the student how political 
conflict—whether personal or international—can often be modeled by variable- 
sum games such as Prisoners’ Dilemma or Chicken. Prisoners’ Dilemma abstracts 
the frustrations at all levels of society between individual striving and cooperative 
agreement. The game of Chicken can embody the idea of deterrence; Taylor uses 
it to model two interpretations of the Cuban missile crisis. The “dollar auction,” 
with which Taylor introduces his book, provides a rationale for the seemingly 
irrational behavior inherent in escalation. His treatment of applied game theory 
could serve as an introduction to Straffin’s undergraduate text, Game Theory and 
Strategy [9]. 
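
The difference between these two games shows up in their Nash equilibria, the solution concept described earlier in this review. The sketch below finds pure-strategy equilibria by checking every cell for a profitable unilateral deviation. The ordinal payoffs (4 = best, 1 = worst) are the customary rankings for these games and are assumed here; they are not taken from Taylor's text.

```python
def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs[(r, c)] = (row player's payoff, column player's payoff).
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r in rows:
        for c in cols:
            row_pay, col_pay = payoffs[(r, c)]
            # Neither player may gain by deviating alone.
            row_ok = all(payoffs[(r2, c)][0] <= row_pay for r2 in rows)
            col_ok = all(payoffs[(r, c2)][1] <= col_pay for c2 in cols)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria

# Assumed ordinal payoffs, 4 = best outcome for that player, 1 = worst.
prisoners_dilemma = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (1, 4),
    ("defect", "cooperate"): (4, 1),
    ("defect", "defect"): (2, 2),
}
chicken = {
    ("swerve", "swerve"): (3, 3),
    ("swerve", "straight"): (2, 4),
    ("straight", "swerve"): (4, 2),
    ("straight", "straight"): (1, 1),
}

if __name__ == "__main__":
    print(pure_nash_equilibria(prisoners_dilemma))
    print(pure_nash_equilibria(chicken))
```

Under these payoffs Prisoners' Dilemma has a single equilibrium at mutual defection, which both players rank below mutual cooperation, while Chicken has two equilibria in which exactly one player backs down.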

Alan Taylor’s book is carefully crafted. He is ever aware of his audience, but 
relentlessly presses the beginning student to understand more and more ideas. The 
text is appropriate for bright, intellectually motivated, but mathematically un- 
trained, undergraduates, who are provided with the opportunity to experience a 
significant frontier of mathematics. 
TELEGRAPHIC REVIEWS 


Edited by Arnold Ostebee 


with the assistance of the Mathematics Departments of 
Carleton, Macalester, and St. Olaf Colleges 


Telegraphic Reviews are designed to alert readers in a timely manner to new books 
appropriate to mathematics teaching and research. Special codes classify reviews by 
subject area and appropriate use: 


T : Textbook                 P : Professional Reading 
S : Supplementary Reading    L : Undergraduate Library 
C : Computer Software        13-18 : Grade Level 
1-4 : Semester               ** : Special Emphasis 
?? : Questionable 


Readers are advised that price information is subject to change. Selected books 
receive a second, more extensive review in the Monthly. 


Books submitted for review should be sent to Book Reviews Editor, American Mathe- 
matical Monthly, St. Olaf College, 1520 St. Olaf Avenue, Northfield, MN 55057-1098. 


Recreational Mathematics, S. 3-D Geo- 
metric Origami: Modular Polyhedra. Rona 
Gurkewitz, Bennett Arnstein. Dover, 1995, 
iv + 73 pp, $6.95 (P). [ISBN 0-486-28863-3] 
Step-by-step instructions and clear diagrams for 
constructing over 50 modular polyhedra-based 
models. JNC 


Education, P. Mathematical Education of En- 
gineers. Eds: L.R. Mustoe, S. Hibberd. Inst. 
of Math. & Its Applic. Conf. New Ser., No. 
57. Clarendon Pr, 1995, xxiii + 383 pp, $120. 
[ISBN 0-19-851191-4] Proceedings of a 1994 
conference at Loughborough University. 


History, P. A New Branch of Mathemat- 
ics. Hermann Grassmann. Transl: Lloyd 
C. Kannenberg. Open Court, 1995, xvi + 
555 pp, $32.95 (P). [ISBN 0-8126-9275- 
6] Translation of Grassmann’s “Die lineale 
Ausdehnungslehre,” “Geometric Analysis,” and 
other selected papers. LC 


History, P, L. Vita Mathematica: Historical 
Research and Integration with Teaching. Ed: 
Ronald Calinger. MAA, 1996, xii + 359 pp, 
$34.95 (P). [ISBN 0-88385-097-4] Papers 
on the history of mathematics and its integration 
with the teaching of mathematics. Topics 
range from historical surveys (for example, 
“The Combinatorics and Induction in Medieval 
Hebrew and Islamic Mathematics,” by Katz) 
to pedagogy (“History of Mathematics and the 
Teacher,” by Heiede). Valuable resource. LC 


History, P. Of Men and Numbers: The Story of 
the Great Mathematicians. Jane Muir. Dover, 
1996, 249 pp, $7.95 (P). [ISBN 0-486-28973- 
7] Republication of the 1961 Dodd, Mead & 
Co. edition. 


History, P. Courant. Constance Reid. Copernicus 
(Imprint: Springer-Verlag), 1996, 228 pp, 
$15 (P). [ISBN 0-387-94674-8] Paperback 
republication of the original 1970 edition (TR, 
June-July 1970; Extended Review, May 1971). 


Number Theory, P. Fundamentals of Number 
Theory. William J. LeVeque. Dover, 1996, 
vii + 280 pp, $8.95 (P). [ISBN 0-486-68906- 
9] Republication of the 1977 Addison-Wesley 
edition. 


Number Theory, T*(13-14: 1). Introduction 
to Number Theory. Peter D. Schumer. PWS, 
1996, xi + 287 pp. [ISBN 0-534-94626-7] A 
very well-written introduction to number theory 
with good examples and problems, and many 
historical asides. Includes material on factor- 
ization and primality testing, continued frac- 
tions, partition theory, and an introduction to 
analytic number theory. DB 


Group Theory, P. Groups, Difference Sets, 
and the Monster. Eds: K.T. Arasu, et al. Ohio 
St. Univ. Math. Res. Inst., V. 4. Walter de 
Gruyter, 1995, xiii + 461 pp, DM 198. [ISBN 3- 
11-014791-2] Proceedings of a 1993 special 
research quarter at The Ohio State University. 


Algebra, P. Tight Closure and Its Applications. 
Craig Huneke. CBMS Reg. Conf. Ser. in Math., 
No. 88. AMS, 1996, ix + 137 pp, $29 (P). 
[ISBN 0-8218-0412-X] 


Algebra, P. Cogroups and Co-rings in Categories 
of Associative Rings. George M. 
Bergman, Adam O. Hausknecht. Math. Surv. 
& Mono., V. 45. AMS, 1996, ix + 388 pp, $79. 
[ISBN 0-8218-0495-2] 


Calculus, S(13). Discovering Calculus with the 
Graphing Calculator. Mary Margaret Shoaf- 
Grubbs. Wiley, 1996, xiv + 204 pp, $23.95 (P). 
[ISBN 0-471-00974-1] Designed to accom- 
pany a standard calculus text. A few intro- 
ductory pages are followed by a series of labs 
each of which includes descriptions of appli- 
cable calculator features. Illustrated by screen 
examples from the TI-82 calculator. JNC 


Calculus, S*(13). CalcLabs with Mathematica. 
Nancy R. Blachman, et al. Brooks/Cole, 
1996, xvi + 245 pp, $20.25 (P). [ISBN 0- 
534-34086-5]; CalcLabs with Maple V. Albert 
Boggess, et al. Brooks/Cole, 1995, xiii + 
229 pp, $20.95 (P). [ISBN 0-534-25590-6] 
Designed for a first-year calculus class with 
1 hour of lab per week. The first 13 or 14 
chapters contain user-friendly introductions to 
commonly-used commands as tools for solv- 
ing calculus problems, and conclude with ex- 
ercises. The last two chapters contain labs and 
longer student projects. The Mathematica ver- 
sion assumes Version 2.2 and use of the Note- 
book Front End; the Maple V Version assumes 
Release 3. JNC 


Complex Analysis, T(18), S. Entire and Mero- 
morphic Functions. Lee A. Rubel. Univer- 
sitext. Springer-Verlag, 1996, viii + 187 pp, 
$39 (P). [ISBN 0-387-94510-5] Begins with 
a clear and concise treatment of Nevanlinna theory. 
Develops the Rubel-Taylor method of 
Fourier series analysis. Presents Pólya’s theory 
of the Borel Transform and Buck’s theory 
of integer valued entire functions. The 
writing is sparse: lemma-theorem-corollary for- 
mat. Contains a small bibliography and no ex- 
ercises. TAV 


Partial Differential Equations, P. Partial Dif- 
ferential Equations of Mathematical Physics 
and Integral Equations. Ronald B. Guenther, 
John W. Lee. Dover, 1995, xii + 562 pp, 
$17.95 (P). [ISBN 0-486-68889-5] Repub- 
lication, with corrections, of the 1988 Prentice 
Hall edition. Includes a new section on “Solu- 
tions and Hints to Selected Problems.” 


Partial Differential Equations, P. A Practi- 
cal Guide to Pseudospectral Methods. Bengt 
Fornberg. Mono on Appl. & Computat. Math., 
V. 1. Cambridge Univ Pr, 1996, x + 231 pp, 
$54.95. [ISBN 0-521-49582-2] Pseudospec- 
tral methods are important in several applica- 
tions areas (e.g., computational fluid dynamics, 
wave motion). Explains how, when, and why 
these methods work. AO 


Dynamical Systems, T(15-17), S, L. Oscil- 
lations in Planar Dynamic Systems. Ronald 
E. Mickens. Ser. on Adv. in Math. for Appl. 
Sci., V. 37. World Scientific, 1996, xii + 319 pp, 
$48. [ISBN 981-02-2292-0] Complete revi- 
sion of Introduction to Nonlinear Oscillations 
(TR, April 1982). New chapters discuss method 
of harmonic balance, and a general procedure 
for two coupled first-order differential equa- 
tions based on Hopf bifurcation theorem and 
averaging. Extensive bibliography. DH 


Numerical Analysis, T(16-17: 2), L. A First 
Course in the Numerical Analysis of Differ- 
ential Equations. Arieh Iserles. Texts in 
Appl. Math. Cambridge Univ Pr, 1996, xvii 
+ 378 pp, $27.95 (P); $74.95. [ISBN 0-521- 
55655-4; 0-521-55376-8] Written for mathe- 
matics (rather than engineering) students. Cov- 
ers the solution of ODEs by multistep and 
Runge-Kutta methods; finite difference and fi- 
nite element methods for the Poisson equa- 
tion; basic methods for parabolic and hyper- 
bolic PDEs. AO 


Numerical Analysis, P, L*. Numerical Methods 
for Least Squares Problems. Åke Björck. 
SIAM, 1996, xvii + 48 pp, $47.50 (P). [ISBN 
0-89871-360-9] A comprehensive and up-to- 
date treatment that includes many recent de- 
velopments. In addition to basic methods, it 
covers methods for modified and generalized 
least squares problems, and direct and iterative 
methods for sparse problems. AO 


Numerical Analysis, P. Lectures on Finite 
Precision Computations. Françoise Chaitin-Chatelin, 
Valérie Frayssé. SIAM, 1996, xvi 
+ 235 pp, $44.50 (P). [ISBN 0-89871-358-7] 
Addresses how finite precision affects the con- 
vergence in practice of numerical methods that 
are known to converge theoretically. DH 


Functional Analysis, T(18), P. Linear Func- 
tional Equations: Operator Approach. Anatolij 
Antonevich. Transl: Victor Muzafarov, Andrei 
Iacob. Oper. Theory: Adv. & Applic., V. 83. 
Birkhäuser Boston, 1996, viii + 179 pp, $123. 
[ISBN 0-8176-2931-9] A unified approach to 
the investigation of a general class of functional 
equations based on the examination of func- 
tional operators and Banach algebras generated 
by them. Uses methods involving dynamical 
systems, operator algebras, and pseudodiffer- 
ential operators. SA 
Functional Analysis, P. Elementary Func- 
tional Analysis. Georgi E. Shilov. Transl: 
Richard A. Silverman. Dover, 1996, vii + 
334 pp, $10.95 (P). [ISBN 0-486-68923-9] 
Republication, with corrections, of the 1974 
MIT Press edition (Volume 2 of Mathematical 
Analysis). 


Analysis, P. Elementary Real and Complex 
Analysis, Revised English Edition. Georgi E. 
Shilov. Transl. & Ed: Richard A. Silverman. 
Dover, 1996, xi + 516 pp, $12.95 (P). [ISBN 
0-486-68922-0] Republication, with correc- 
tions, of the 1974 MIT Press edition (Volume 1 
of Mathematical Analysis). 


Analysis, P, L. Padé Approximants, Sec- 
ond Edition. George A. Baker, Jr., Peter 
Graves-Morris. Ency. of Math. & Its Ap- 
plic., V. 59. Cambridge Univ Pr, 1996, xiv + 
746 pp, $110. [ISBN 0-521-45007-1]  Incor- 
porates many new results and a new chapter on 
multiseries approximants. (First Edition, TR, 
November 1982.) AO 


Analysis, P*, L**. The World According to 
Wavelets: The Story of a Mathematical Tech- 
nique in the Making. Barbara Burke Hubbard. 
AK Peters, 1996, xix + 264 pp, $34. [ISBN 1- 
56881-047-4] An accessible and well-written 
book about wavelets for non-mathematicians. 
The first half recounts the development of this 
field of mathematics and contains (almost) no 
formulas. The second half (“Beyond Plain English”) 
is a collection of articles that provide an 
elementary introduction to wavelets. AO 


Analysis, P. Lecture Notes in Control and In- 
formation Sciences—213: General Hybrid Or- 
thogonal Functions and their Applications in 
Systems and Control. Amit Patra, Ganti Prasada 
Rao. Springer-Verlag, 1996, xx + 118 pp, 
$43 (P). [ISBN 0-540-76039-3] 


Analysis, P. Potential Theory and Degener- 
ate Partial Differential Operators. Ed: Marco 
Biroli. Kluwer Academic, 1995, 184 pp, $99. 
[ISBN 0-7923-3596-1] Proceedings of a 1994 
conference in Parma, Italy. Partially reprinted 
from Potential Analysis, V. 4 (1995). 


Analysis, T(15-17: 1), P. Linear Difference 
Equations with Discrete Transform Methods. 
Abdul J. Jerri. Math. & Its Applic., V. 363. 
Kluwer Academic, 1996, xxi + 439 pp, $199. 
[ISBN 0-7923-3940-1] Tools for studying and 
solving ordinary linear difference equations. 
Covers, in addition to traditional techniques, 
the use of discrete Fourier transforms for solv- 
ing boundary value problems. AO 


Algebraic Geometry, P. Abelian Functions: 
Abel’s Theorem and the Allied Theory of Theta 
Functions. H.F. Baker. Cambridge Univ Pr, 
1995, xxxv + 684 pp, $39.95 (P). [ISBN 0- 
521-49877-5] 


Differential Geometry, T(18: 1), P. Basic 
Concepts of Synthetic Differential Geometry. 
René Lavendhomme. Texts in Math. Sci., V. 13. 
Kluwer Academic, 1996, xv + 320 pp, $159. 
[ISBN 0-7923-3941-X] Introduction to syn- 
thetic differential geometry, an approach to dif- 
ferential geometry that uses infinitesimal ele- 
ments (objects whose squares are 0) and intu- 
itionist logic. JO 


Geometry, S*, P**, L**. Beyond the Third Di- 
mension: Geometry, Computer Graphics, and 
Higher Dimensions. Thomas F. Banchoff. Sci- 
entific American Library, 1996, ix + 211 pp, 
$19.95 (P). [ISBN 0-7167-6015-0] Paper- 
back edition of the highly acclaimed volume by 
the guru of higher dimensions. New computer 
graphics illustrations enhance what was already 
a “visually rich and intellectually enriching” 
portrait of dimensions. (1990 hardcover edi- 
tion, TR, August-September 1990; Extended 
Review, August-September 1991.) JNC 


Topology, S*, P. Counterexamples in Topol- 
ogy. Lynn Arthur Steen, J. Arthur Seebach, Jr. 
Dover, 1995, xi + 244 pp, $8.95 (P). [ISBN 0- 
486-68735-X] Republication of the 1978 Sec- 
ond Edition originally published by Springer- 
Verlag (TR, January 1979). 


Topology, P. Lectures on Spaces of Nonposi- 
tive Curvature. Werner Ballmann. DMV Sem., 
Band 25. Birkhäuser Boston, 1995, v + 112 pp, 
$32 (P). [ISBN 0-8176-5242-6] 


Optimization, T(18: 1), P. Modified La- 
grangians and Monotone Maps in Optimiza- 
tion. E.G. Golshtein, N.V. Tretyakov. Transl: 
N.V. Tretyakov. Ser. in Disc. Math. & Op- 
tim. Wiley, 1996, ix + 438 pp, $72.95. 
[ISBN 0-471-54821-9] Theory and applications 
of modified Lagrangian functions. Focuses 
on traditional convex programming and 
monotone maps. Applications include numeri- 
cal algorithms for the general convex program- 
ming problem, decomposition, economic mod- 
eling, and nonconvex local constrained opti- 
mization. AO 


Game Theory, T(13-16: 1), P*, L. Fair Di- 
vision: From Cake-Cutting to Dispute Resolu- 
tion. Steven J. Brams, Alan D. Taylor. Cam- 
bridge Univ Pr, 1996, xiv + 272 pp, $54.95; 
$18.95 (P). [ISBN 0-521-55390-3; 0-521- 
55644-9] Presents criteria for fairness, the lat- 
est constructive procedures for a fair division 
(e.g., envy-free allocation) of goods, and real- 
life applications. DH 




Stochastic Processes, P. Discrete-Time 
Markov Control Processes: Basic Optimality 
Criteria. Onésimo Hernández-Lerma, Jean 
Bernard Lasserre. Applic. of Math., V. 30. 
Springer-Verlag, 1996, xiv + 216 pp, $54.95. 
[ISBN 0-387-94579-2] Markov control pro- 
cesses are standard tools in a wide variety of 
settings, from fish hatcheries to portfolio man- 
agement. This book looks at the theoretical un- 
derpinnings and gives a solid treatment of the 
theory. Extensive bibliography. TAV 


Elementary Statistics, T(13: 2). The New 
Statistical Analysis of Data. T.W. Anderson, 
Jeremy D. Finn. Springer-Verlag, 1996, xxi + 
712 pp, $59.95. [ISBN 0-387-94619-5] Up- 
dated and somewhat more elementary version 
of Anderson and Sclove’s The Statistical Analy- 
sis of Data (First Edition, TR, December 1978). 
Changes from the 1986 Second Edition include 
additional chapters on descriptive measures and 
probability distributions, omission of the chapter 
on multiple regression, and a change to SPSS 
as the computer package of choice. RSK 


Mathematical Computing, C, L. The Math- 
ematica Book, Third Edition. Stephen Wol- 
fram. Cambridge Univ Pr, 1996, 1395 pp, 
$59.95; $44.95 (P). [ISBN 0-521-58889-8; 0- 
521-58888-X] User guide and reference man- 
ual for Mathematica 3.0, the most recent ver- 
sion of this software. (1988 Addison-Wesley 
edition, TR, October 1988; Extended Review, 
November 1989.) AO 


Mathematical Computing, P*. The Maple 
Handbook: Maple V Release 4. Darren Red- 
fern. Springer-Verlag, 1996, 495 pp, $29 (P). 
[ISBN 0-387-94538-5] Reference tool. Brief 
entries for each command are organized in sub- 
ject area categories (calculus, linear algebra, 
combinatorics, number theory, etc.). AO 


Computer Science, P, L. Practical UNIX 
and Internet Security, Second Edition. Simson 
Garfinkel, Gene Spafford. O’Reilly & Asso- 
ciates, 1996, xxix + 971 pp, $39.95 (P). [ISBN 
1-56592-148-8] 

Applications (Fluid Mechanics), P. Navier- 
Stokes Equations and Related Nonlinear Prob- 
lems. Ed: A. Sequeira. Plenum Pr, 1995, ix 
+ 406 pp, $115. [ISBN 0-306-45118-2] Pro- 
ceedings of a 1994 conference in Funchal, Por- 
tugal. 

Applications (Fluid Mechanics), P. Annual 
Review of Fluid Mechanics, V. 28, 1996. Eds: 
John L. Lumley, Milton van Dyke, Helen L. 
Reed. Annual Reviews, 1996, x + 598 pp, $52. 
[ISBN 0-8243-0728-3] 


Applications (Fluid Mechanics), P. Com- 
putational Methods for Fluid Dynamics. J.H. 
Ferziger, M. Perić. Springer-Verlag, 1996, xiv 
+ 356 pp, $49.50 (P). [ISBN 3-540-59434- 
5] An overview of commonly used methods 
including direct and large eddy simulation of 
turbulence, multigrid methods, parallel comput- 
ing, moving grids, and free surface flows. AO 


Applications (Physics), T(18), P.  Evolu- 
tion Processes and the Feynman-Kac Formula. 
Brian Jefferies. Math. & Its Applic., V. 353. 
Kluwer Academic, 1996, ix + 235 pp, $125. 
[ISBN 0-7923-3843-X] The evolution of a 
physical system can often be described in terms 
of a semi-group of linear operators. Observa- 
tions may be modelled by a spectral measure. 
A combination of these basic objects produces 
a family of operator-valued set functions, by 
which perturbations of the evolution are repre- 
sented as path integrals. Integration theory in 
vector spaces is a central topic of this work. SA


Applications (Physics), P. Angular Momen- 
tum in Quantum Mechanics. A.R. Edmonds. 
Landmarks in Physics. Princeton Univ Pr, 
1996, viii + 146 pp, $39.50 (P). [ISBN 0-691-
07912-9] Republication of the 1974 corrected 
printing of the Second Edition. 


Applications (Physics), P. General Theory 
of Relativity. P.A.M. Dirac. Landmarks in 
Physics. Princeton Univ Pr, 1996, viii + 71 pp, 
$10.95 (P). [ISBN 0-691-01146-X] Republi- 
cation of the 1975 Wiley edition. 


Applications (Systems Theory), P. Lecture 
Notes in Control and Information Sciences— 
212: Formal Specification and Synthesis of Pro- 
cedural Controllers for Process Systems. Ar- 
turo Sanchez. Springer-Verlag, 1996, xxiv + 
221 pp, $54 (P). [ISBN 3-540-76021-0]


Applications. Monitoring a Comprehensive 
Test Ban Treaty. Eds: Eystein S. Husebye, An- 
ton M. Dainty. NATO ASI Ser. E, V. 303. 
Kluwer Academic, 1996, xxv + 836 pp, $349. 
[ISBN 0-7923-3811-1]


Applications, P, L*. The Algorithmic Beauty 
of Plants. Przemyslaw Prusinkiewicz, Aristid
Lindenmayer. Springer-Verlag, 1996, xii
+ 228 pp, $29.95 (P). [ISBN 0-387-94676- 
4] Computer graphics techniques for model- 
ing plant development and plant shapes. Em- 
phasizes use of Lindenmayer systems. Many 
color plates. AO 


Reviewers 


SA: Steve Abbott, St. Olaf; DB: David Bressoud, 
Macalester; JNC: Judith N. Cederberg, St. Olaf; LC: Laura 
Chihara, St. Olaf; DH: Deanna Haunsperger, Carleton; 
RSK: Richard S. Kleber, St. Olaf; JO: Jeff Ondich, Car- 
leton; AO: Arnold Ostebee, St. Olaf; TAV: Theodore A. 
Vessey, St. Olaf. 

BRUCE POURCIAU holds the usual degrees (B.A. from Brown, Ph.D. from UC San Diego
under Hubert Halkin) and since 1976 has held the usual positions (assistant, associate,
professor of mathematics) at Lawrence University, overlooking the serene Fox River in
Appleton, Wisconsin. His mathematical interests include optimization theory, various blends
of topology and analysis, the philosophy of mathematics, especially intuitionism, and the
history of mathematics, especially Newton’s Principia. When he isn’t playing tennis,
photographing nature, listening to Mozart, or reading mysteries, you will find him involved with
son Sean, and daughter Laurel.

EDITOR’S ENDNOTES


Every five years or so, one editorial cycle ends and another begins. More than a year ago, 
members of the Editorial Board and staff listed inside the front cover began to solicit, read, 
select, and edit the articles and features for this issue and its 49 successors. We were ably 
tutored and generously nurtured by John Ewing and his Editorial Board, who helped us 
master the traditions and mechanics of the vigorous and healthy enterprise they have passed 
on to us. 

We are fully committed to the MONTHLY’s historic mission of publishing high-quality 
exposition of mathematics in order to advance and serve the broad spectrum of collegiate 
mathematics. As authors, we know that good expository writing about deep ideas for a 
general mathematical audience can be hard work, but as readers and teachers we know that 
it is well worth the effort. We know that our efforts have been successful when an article is 
interesting enough that our nonspecialist colleagues are willing to read it in bed, talk about 
it at coffee hour, lift a tidbit from it to present in class, or recommend it to a bright 
undergraduate for independent reading. 

To ensure an interesting variety of articles in each issue, the Editorial Board is actively 
soliciting articles in history and biography, statistics, computer science, modern applied 
mathematics, and mathematics education. In the latter area, we are especially interested in 
submissions that: encourage communications between mathematicians and mathematics 
educators, and between the mathematics community and client disciplines; encourage 
mathematicians to reflect on their own teaching; and share applicable results from mathe- 
matics education research. Naturally, we expect the steady flow of articles from core 
mathematical areas to continue while we give extra encouragement to areas from which we 
now see only a trickle. 

What qualities distinguish the few article submissions that are published (53 last year) 
from the many hundreds that are not? Novelty of ideas is neither necessary for acceptance 
nor sufficient for rejection. Interesting articles that present well-known ideas are welcome at 
the MONTHLY. Interesting articles that successfully present the fruits of current research to
our broad audience of mathematical readers are welcome, too; papers addressed to experts 
at the frontier should be submitted to appropriate specialized journals. Award-winning 
MONTHLY articles have high expository quality, which means much more than grammar, 
punctuation, and spelling. Somehow they find an attractive way to invite the reader to begin, 
keep interest high with well-chosen examples and figures, illustrate key issues via artfully- 
chosen special cases, and reward the reader’s active engagement by informing, enriching, 
and even entertaining. Clarity of exposition and broad appeal are always more important in 
a Monthly article than originality of the material or generality of the results. 

There is no ideal length for a MONTHLY article, which could be quite short if that is 
appropriate to the material. However, an upper bound of at most 60 pages for articles in 
each issue means that every decision to publish a very long article is necessarily a decision to 
limit the variety of articles in an issue, and we are always reluctant to do that. 

MONTHLY Notes are not short articles. In addition to insisting that Notes be readily 
accessible to a broad mathematical audience, we expect them to be mathematical gems that 
demonstrate significant originality in results, proof, or viewpoint. 

And now a few words about technology. We are experimenting with new ways to 
communicate with readers and authors, and invite you to visit the MONTHLY’s section of 
MAA Online at http://www.maa.org/. There you will find tables of contents of forthcoming
issues, brief descriptive summaries of articles, and information for authors about
submitting papers, providing TeX source files, and achieving clarity in mathematical writing. 

Like all the Editors of the 36 years of MONTHLY volumes in my bookcase, I encourage 
you to write me about anything concerning our journal that you think is important. My 
predecessors tell me to expect very little mail from readers, but perhaps the convenience of 
email will stimulate a flow of comments and reactions that can beneficially inform the
choices we make daily in the editorial office. 


Abe Shenitzer has suggested the following clarification of the lead paragraph on 
noncommutative ring theory on p. 418 of the May, 1996 issue (Vol. 103): 


Hamilton’s “physical” motivation was to define an algebra of triples that would do for rotations 
in 3-space what the complex numbers do for rotations in the plane. Having failed in this task, he 
turned to quadruples of reals and created the algebra of quaternions. The quaternions did, in 
fact, yield the required computing tool for rotations in 3-space. 


Gary White wrote the following after reading Gary Lawlor’s article on the brachis- 
tochrone last March [103 (1996) 242-249]: 


The phrase “a marble rolling without friction,” which appears throughout the paper, is, at best,
an oxymoron. A marble can roll only if friction is present...if there were no friction present 
then the marble would simply slide down any hill without rolling—friction provides the torque 
that causes the marble to rotate as it translates down the hill. Furthermore, if the angle of the 
tangent to the hill is too steep, then the frictional force is too weak to keep the marble from 
slipping initially, so that in large amplitude motion of a marble in a cycloid-shaped trough, one 
expects to have both rolling and sliding occur, at least on the steepest parts of the trough... I 
suspect that what the author means by “rolling without friction” is, in fact, “rolling without 
slipping” or, more precisely, “rolling with no non-conservative work being done.” The more 
realistic problem of sliding and rolling with friction goes by the name of “motion with 
non-holonomic constraints” in more advanced classical mechanics textbooks, and, frankly, is 
usually avoided by them. Frank Crawford has written an interesting article about Galileo’s 
confusion of this issue and some at-home rolling experiments; see Rolling and Slipping Down 
Galileo’s Inclined Plane: Rhythms of the Spheres, Amer. J. Physics 64 (5), May, 1996. 


David Fowler’s lead article in last January’s issue [103 (1996) 1-17] contained three 
figures that illustrate the behavior of the binomial coefficient function of two real variables 
“x choose y.” Unfortunately, the fine detail of these remarkable figures was largely lost in 
the printing process, though a much better version of the first figure is reproduced on the 
cover of the August-September issue in 1995 [102 (7) (1995)]. The author has prepared an 
insert sheet in which these figures are reproduced with high resolution. Postscript files in 
various forms (corregenda.ps = 4.2MB; .pdf = .74MB; .psZ = .79MB) are available by ftp 
from 


ftp.maths.warwick.ac.uk/pub/papers/dhf
or at
http://www.maths.warwick.ac.uk/maths/papers/dhf.html.


For a printed copy, contact the author at University of Warwick. 

Harold Boas writes that the question of differentiability of the ruler function discussed in 
a Note by Richard Darst and Gerald Taylor in last May’s issue [103 (1996) 415-416] has a 
long history. Some previous MONTHLY papers that address this problem are: Gerald J. 
Porter, On the differentiability of a certain well-known function, 69 (1962) 142; G. A. 
Heuer, Functions continuous at the irrationals and discontinuous at the rationals, 72 (1965) 
370-373; J. E. Nymann, An application of Diophantine approximation, 76 (1969) 668-671; 
Alec Norton, Continued fractions and differentiability of functions, 95 (1988) 639-643. 


Roger A. Horn, Editor 

Dedicated to Educational Excellence for More Than 40 Years


Faculty Consultants for the Advanced Placement Reading 


This June more than 3,700 college faculty and Advanced Placement teachers will gather for one week to 
evaluate and score students’ essays at the annual AP Reading. 

Applications are now being accepted for faculty consultants at this reading. Participants exchange ideas 
and contribute suggestions about their discipline, their courses, and the AP Examinations. They are paid 
honoraria, provided with housing and meals, and reimbursed for travel expenses. The College Board’s Advanced 
Placement (AP®) Program gives high school students an opportunity to take college-level courses and 
appropriate exams in 18 disciplines. More than 3,400 colleges and universities worldwide offer credit or 
advanced standing to students based on their exam performance. 

Applications are now being accepted for faculty consultants in the following subject areas: 


• Art    • Economics    • Government and Politics    • Physics
• Biology    • English    • History    • Psychology
• Calculus    • Environmental Science    • International English Language    • Spanish
• Chemistry    • French    • Latin    • Statistics
• Computer Science    • German    • Music Theory

Applicants should currently be teaching or directing instruction for the AP course or the corresponding 
college course in these disciplines. 

To receive an application or to send one to a colleague, contact: 


Educational Testing Service 
Essay Reading Office, MS 23-D 
Princeton, NJ 06541 

e-mail: dcranstoun@ets.org 


Visit our web site and complete your application online! 
http://www.collegeboard.org/ap/html/faculty/invit001.html 


Educational Testing Service is an Equal Opportunity/Affirmative Action Employer and especially encourages minorities and women to apply 


Join us for a World Class 
Meeting in America’s Olympic City 


MAA Summer 
Mathfest 97
August 1-4, 1997
Atlanta, Georgia 


For details, look up "Meetings" on 
MAA Online: http://www.maa.org


THE MATHEMATICAL ASSOCIATION OF AMERICA 

This book contains the best problems selected
from over 25 years of the Problem of the Week 
at Macalester College. Readers will find here a 
collection of intriguing and thought provoking 
problems that will give students (high school or 
beyond), teachers, and university professors a 
chance to experience the pleasure of wrestling 
with some beautiful problems of elementary 
mathematics. 


Compare your sleuthing talents with those of 
Sherlock Holmes, who made a bad mistake 
regarding the first problem in the collection: 
Determine the direction of travel of a bicycle 
that has left its tracks in a patch of mud. The 
collection contains a variety of other unusual 
and interesting problems in geometry, algebra, 
combinatorics and number theory. For exam- 
ple, if a pizza is sliced into eight 45-degree 
product 1 · 2 · 3 ⋯ 1000000? Or: Is a manufacturer’s
claim that a certain unusual combination
lock allows thousands of combinations justified? 


Complete solutions to the 191 problems are 
included along with problem variations and 
topics for investigation. This collection will be 
especially valuable to teachers who are looking 
for stimulating ways to engage their students 
with the beauty and intrigue that can often be 
found in elementary mathematics. 

years) on the theory of functions of a real variable.
Earlier editions of this classic Carus Monograph cov- 
ered sets, metric spaces, continuous functions, and 
differentiable functions. The fourth edition adds sec- 
tions on measurable sets and functions, the Lebesgue 
and Stieltjes integrals, and applications. The book is 
accessible to readers with some mathematical sophis- 
tication and a background in calculus. It is suitable 
either for self-study or for supplemental reading in a 
course on advanced calculus or real analysis. 


Not intended as a systematic treatise, this book has 
more the character of a sequence of lectures on a 
variety of topics connected with real functions. 
Many of these topics are not commonly encountered 
in undergraduate textbooks: for example, the exis- 
tence of continuous everywhere-oscillating functions 
(via the Baire category theorem); two functions hav- 
ing equal derivatives, yet not differing by a constant; 
application of Stieltjes integration to the speed of 
convergence of infinite series. 

efficient, accurate, and powerful math system—
Unrivaled Features... The Best Gets Better.
The Power Edition provides the most flexible and 
user-friendly interface for comprehensive math 
packages. Exciting interactive facilities let you 
create dynamic electronic learning environ- 
ments and professional, technical presenta- 
tions. Maple V’s extensive set of powerful 
math functions has been greatly enhanced by 
offering more and improved functions to help 
you and your students explore and compre- 
hend the most difficult mathematical concepts. 
Maple V — The Power Edition... the new 
standard. 
a new state of the art for computer-based

Waterloo Maple


ADVANCING MATHEMATICS 


Waterloo Maple Inc. 
450 Phillip Street, Waterloo, ON, 
Canada N2L 5J2 
Phone: (519) 747-2373 
Fax: (519) 747-5284 
Sales: 1-800-267-6583 
info@maplesoft.com 


Maple and Maple V are registered trademarks of 
Waterloo Maple Inc. 



Over 100 Books are 
now available to help 
you realize the full 
power of Maple V. 


[Screenshot of a Maple V Release 4 worksheet: "The Divergence of a Vector Field"]

http://www.maplesoft.com


SPRINGER FOR APPLIED MATHEMATICS 


New Column for 1997:
Mathematical Communities 


THE MATHEMATICAL 
INTELLIGENCER 


Editor-in-Chief: CHANDLER DAVIS, 
University of Toronto, Canada 
some of the world’s most renowned and
respected mathematicians. Since the very 
first issue, The Mathematical Intelligencer 
has covered the history of math and histo- 
ry-making math, including the many con- 
troversies that surround all facets of 
mathematics. 


This one-of-a-kind publication is written 
specifically for mathematicians. It gives new 
insight to old equations and offers purely 
mathematical entertainments that can be dis- 
cussed in the math libraries and lounges 
around the world. The Mathematical 
Intelligencer is not the material mathe- 
maticians have to read, but what they want 
to read for enjoyment. 
and academia meet the worlds of journalism
and politics: social organization, gov-
ing to policy decisions. By dealing with case
studies and providing extensive documentation,
Lang challenges some individuals
challenges us to reconsider the ways they
exercise their official or professional responsibilities,
and challenges us to form our own
judgment.

The purpose of this book is to show what
mathematics is about, how it is done, and
what it is good for. The text presents eight
mathematical thought, as well as the diversity
of mathematical ideas. Drawn from
they include: spirals in nature and in mathematics,
the modern topic of fractals and the
algebra meet and interact, modular arith-

Yueh-Gin Gung and Dr. Charles Y. Hu
Award for Distinguished Service to 
Deborah Tepper Haimo 


Carole B. Lacampagne 


From her early days at Radcliffe, studying with Hassler Whitney, and later meeting her 
husband-to-be in a Harvard class taught by Saunders Mac Lane, Deborah Tepper Haimo 
has had a love affair with mathematics. Her dedication to the MAA began soon after Frank 
and Deborah Haimo’s marriage when they both joined the Association. It culminated in her 
becoming president of the MAA in 1992. She continues to be an active and influential 
member of the mathematics community. 

All presidents of the MAA are called upon to provide a heavy service effort. But 
Professor Haimo has gone beyond the normal presidential service by her reorganization of 
the cumbersome MAA committee structure, by her personal devotion to obtaining the 
recognition of outstanding teaching in each MAA Section of the country, by creating the 
national awards bearing the name of her late husband and herself, and by encouraging the 
participation of women in mathematics at every level and in the Association. These are 
tremendously valuable achievements, worthy of the Gung-Hu Award. 

Deborah Haimo recognized well before her presidency the need for reorganization of 
the MAA’s committee structure and chaired a committee that devised the Coordinating 
Council system currently being used. This new structure made order out of chaos, and it is 
difficult to imagine carrying on with the old structure, given the complexity of the MAA 
today. 

Professor Haimo’s dedication to excellence in teaching is clear in her own well-honed 
lessons and in her innovative, applications-oriented teacher enhancement program for high 
school teachers of mathematics. Again, with typical insight, she called attention to the fact 
that although the Association has always claimed to value good teaching, there was nothing 
in the awards structure of the organization to highlight this. Outstanding expository writing 
(the Chauvenet, Ford, Allendoerfer, Pólya, Hasse, and Beckenbach awards) and service (the
Gung-Hu Award, Certificates for Meritorious Service) are appreciated and rewarded, but 
there was no particular recognition for excellence in teaching. She proceeded to stir things 
up (quietly, as is her style) and the Association soon established sectional awards for 
teaching, along with three national awards. 

Just to make sure that new teaching awards are likely to continue, she initiated a
significant gift to the Association. This prompted the Board of Governors to name the 
national awards the Deborah and Franklin Tepper Haimo Awards for Distinguished 
College or University Teaching of Mathematics. These are now firmly established as the 
prestigious awards for teaching and the list of winners is indeed distinguished. 

Even prior to her presidency, Professor Haimo served the Association for many years on 
many important committees (the Committee on the Teaching of Undergraduate Mathemat- 
ics, the 1975 Nominating Committee, the Program Committee for the 1977 Meetings in St. 
Louis, the Committee on the Participation of Women in Mathematics, to name a few) and 
as a member-at-large of the Board of Governors (1974-76). In 1986-87 she was First Vice 
President and in 1988-1989 she served as chair of the Search Committee for an Executive 
Director. She is now chair of both the Nominating Committee and the Development 
Committee. 

As a woman mathematician, Professor Haimo has not only been a role model for female 
students, but has written and spoken energetically about the need to give young women the 
opportunity to study mathematics to the limit of their abilities and interests by making sure
that the climate in mathematics departments and elsewhere is supportive, welcoming, and
encouraging to them. At the same time, she has been insistent that female students, like 
their male counterparts, be challenged to achieve at the very highest mathematical levels, at 
every step from kindergarten through graduate school. 

Her undergraduate years were spent at Radcliffe College, which gave her its Alumnae
Recognition Award in 1993. She went on to earn a Ph.D. in classical analysis at Harvard 
University. Although she had no connection beforehand with Franklin and Marshall 
College, she received an honorary degree of Doctor of Science at a special spring convoca- 
tion in 1991. After faculty appointments at Washington University (St. Louis) and Southern 
Illinois University at Edwardsville, she took a position at the University of Missouri-St.
Louis where she served as chair of the Department for some years and eventually became 
Professor Emerita. She has held visiting appointments at the Technion in Israel and at the 
Institute for Advanced Study, Princeton, as a member in 1972-73 and later as a trustee of 
the Association of Members. 

Always interested in educational matters, aside from research supported by federal 
agencies, she received numerous grants for teacher education programs at the University of 
Missouri. She has served on numerous national and international panels and committees: 
mathematician for an Agency for International Development Science Team to evaluate 
graduate programs at Seoul National University (1974), the ETS College Level Examina- 
tions Program Committee (1986-89), the MAA/NCTM National Selection Committee for 
the Presidential Awards for Excellence in Science and Mathematics Teaching (1988), to 
name a few. 

At the same time, she has never abandoned her role as a research mathematician, having 
published over 45 papers in classical analysis, in particular, on generalizations of the heat 
equation, special functions, and harmonic analysis. She has served as an associate editor of 
the SIAM Journal on Mathematical Analysis and the American Mathematical Monthly. She 
now holds an appointment at the University of California, San Diego, where she lives right 
near the ocean that she dearly loves. 

Always active in educational matters outside mathematics as well as within, she served as 
a trustee of Radcliffe College between 1975 and 1981, and she has just completed her 
tenure as a member of the Board of Overseers of Harvard University. She is active in the 
American Association for the Advancement of Science and after serving on statewide 
committees in Missouri and Connecticut, since settling in California, she was recently 
appointed by the State Board of Education to the Mathematics Framework and Criteria 
Committee, a major assignment. Her contributions to mathematics and to education 
continue at a breathtaking pace. 

Professor Haimo is the mother of five very talented offspring and has nine grandchildren 
and one great grandchild. An active sportswoman, she is an enthusiastic tennis player. She 
has participated and won many medals in track events at Senior Olympics in Missouri and 
Illinois, and even won a gold medal for racewalking recently in San Diego. 


98 AWARD FOR DISTINGUISHED SERVICE TO DEBORAH TEPPER HAIMO [February 


Scheduling Conflict-free Parties 
for a Dating Service 


Bryan L. Shader and Chanyoung Lee Shader 


1. THE MORAL. We are interested in a scheduling problem that a dating service 
might confront. The problem is an excellent vehicle for tying together fundamental 
concepts in an elementary linear algebra course. The solution presented illustrates 
the beautiful and powerful relationship among graph theory, combinatorics, and 
linear algebra. Without linear algebra the problem seems intractable. Yet elemen- 
tary tools from linear algebra crack the problem. 

In addition to attracting students’ attention, the dating service problem rein- 
forces many of the concepts identified by the Linear Algebra Curriculum Study 
Group as essential in an elementary linear algebra course [3]. The concepts that 
arise in the problem include: the different views of matrix multiplication, block 
matrix multiplication, matrix factorizations, the importance of rank, the relation- 
ships between rank and the determinant, and properties of determinants. The 
moral is: “A little linear algebra goes a long way”. 


2. THE DATING SERVICE PROBLEM. A dating service has as clients $n$ recently
divorced couples, $m_1$ & $w_1$, $m_2$ & $w_2$, ..., and $m_n$ & $w_n$, where $m_i$ and $w_i$ are the
man and woman, respectively, of the $i$th divorced couple. Each client wants the
opportunity to socialize with the other clients, but refuses to be in the same room
as his or her ex-spouse. Thus, the service cannot throw just one party and must
arrange a sequence of parties. To avoid embarrassing situations, if $m_i$ and $w_j$ are
invited to the same party, then $i \ne j$. In addition, in order to provide equal
opportunity for each potential new couple, the dating service would like $m_i$ and $w_j$
to go to exactly one common party whenever $i \ne j$. For 5 divorced couples


Party 1: $m_1, w_2, w_3, w_4, w_5$
Party 2: $m_2, w_1, w_3, w_4, w_5$
Party 3: $m_3, w_1, w_2, w_4, w_5$
Party 4: $m_4, w_1, w_2, w_3, w_5$
Party 5: $m_5, w_1, w_2, w_3, w_4$


is one possible sequence of parties, and 


Party 1: $m_1, m_2, w_3, w_5$
Party 2: $m_2, m_3, w_4, w_1$
Party 3: $m_3, m_4, w_5, w_2$
Party 4: $m_4, m_5, w_1, w_3$
Party 5: $m_5, m_1, w_2, w_4$


is another. 
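
The two requirements on a sequence of parties (no ex-spouses together, and every other man-woman pair together exactly once) can be checked mechanically. The following is a minimal sketch, not part of the original article; the function name `check_parties` and the encoding of a party as a pair (set of men's indices, set of women's indices) are assumptions made for illustration.

```python
from itertools import product

def check_parties(n, parties):
    """Return True if no party contains both m_i and w_i, and every pair
    (m_i, w_j) with i != j attends exactly one common party.

    Each party is a pair (men, women) of sets of indices in 1..n.
    """
    # count[(i, j)] = number of parties attended by both m_i and w_j
    count = {(i, j): 0 for i, j in product(range(1, n + 1), repeat=2)}
    for men, women in parties:
        for i, j in product(men, women):
            count[(i, j)] += 1
    return all(count[(i, j)] == (0 if i == j else 1) for (i, j) in count)

# The two sequences of parties listed above for 5 divorced couples.
first = [({i}, set(range(1, 6)) - {i}) for i in range(1, 6)]
second = [({1, 2}, {3, 5}), ({2, 3}, {4, 1}), ({3, 4}, {5, 2}),
          ({4, 5}, {1, 3}), ({5, 1}, {2, 4})]

print(check_parties(5, first), check_parties(5, second))
```

Both schedules should pass the check, while a schedule that seats an ex-couple together, or omits a compatible pair, should fail it.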
The dating service has approached its staff mathematician with the following 
questions: 


1. Is it possible to design such a sequence of parties? 
2. If so, (for economy’s sake) what is the fewest number of parties that can be 
held? 
3. What are all the possible sequences that have the fewest number of parties?

4. How should one handle more general situations (that is, when the clients 
aren’t n recently divorced couples but there are still some incompatible 
couples)? 


In the next section, we describe how the staff mathematician can formulate the 
general dating service problem as a problem in graph theory. In section 4, we use 
basic linear algebra to solve the problem for n divorced couples, and to discover 
some surprising consequences. For example, we will see that for n divorced 
couples the dating service must throw at least $n$ parties, and if $n - 1$ is a prime
number then there are just two different ways that the parties can be arranged.


3. GRAPH THEORY. A bipartite graph $B = B(M, W, C)$ consists of disjoint finite
sets $M$ and $W$, and a given subset $C$ of $M \times W$. The elements of $M \cup W$ are
called vertices and the elements of $C$ are called edges. The bipartite graph
$B(M, W, C)$ can be described by a diagram. Each element of $M \cup W$ is identified
with a point, and the point corresponding to $m \in M$ is joined by a line segment to
the point corresponding to $w \in W$ if and only if $(m, w) \in C$. For example, if
$M = \{m_1, m_2, m_3, m_4, m_5\}$, $W = \{w_1, w_2, w_3, w_4, w_5\}$, and $C = \{(m_i, w_j) \mid 1 \le i, j \le 5,\ i \ne j\}$,
then the diagram of $B(M, W, C)$ is:

A biclique of $B$ is an ordered pair $(X, Y)$ where $X$ is a subset of $M$, $Y$ is a subset
of $W$, and $X \times Y$ is a subset of $C$. For example, if $X = \{m_1, m_2\}$ and $Y =
\{w_3, w_4, w_5\}$, then $(X, Y)$ is a biclique of the graph illustrated in Figure 1. Two
bicliques $(X_1, Y_1)$ and $(X_2, Y_2)$ of $B$ are disjoint provided $(X_1 \times Y_1) \cap (X_2 \times Y_2)
= \emptyset$. A biclique partition of $B$ is a collection $(X_1, Y_1), (X_2, Y_2), \ldots, (X_k, Y_k)$ of
bicliques of $B$ such that the bicliques $(X_i, Y_i)$ and $(X_j, Y_j)$ are disjoint for all $i \ne j$,
and

$$C = (X_1 \times Y_1) \cup \cdots \cup (X_k \times Y_k).$$
For example,

$$\begin{aligned}
&(\{m_1, m_2\}, \{w_3, w_5\}),\\
&(\{m_2, m_3\}, \{w_4, w_1\}),\\
&(\{m_3, m_4\}, \{w_5, w_2\}),\\
&(\{m_4, m_5\}, \{w_1, w_3\}),\\
&(\{m_5, m_1\}, \{w_2, w_4\})
\end{aligned} \tag{1}$$

is a biclique partition of the graph in Figure 1.


Figure 1 
Since $(\{m\}, \{w\})$ is a biclique for each edge $(m, w) \in C$, the collection of all such
bicliques forms a biclique partition of $B$. Thus, $B$ always has at least one biclique
partition. Indeed, since $(\{m_i\}, \{w_j : (m_i, w_j) \in C\})$ is a biclique for each $i$, and
$(\{m_i : (m_i, w_j) \in C\}, \{w_j\})$ is a biclique for each $j$, $B$ always has a biclique partition with
$\min\{|M|, |W|\}$ bicliques, where $|\cdot|$ denotes the cardinality of a set. The biclique
partition number, $\mathrm{bp}(B)$, of $B$ is the smallest number of bicliques among all biclique
partitions of $B$. The preceding observation gives the upper bound

$$\mathrm{bp}(B) \le \min\{|M|, |W|\}. \tag{2}$$

A biclique partition of $B$ that has exactly $\mathrm{bp}(B)$ bicliques is called an exact
biclique partition.
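
The construction behind the upper bound (2) can be sketched in code: take one "star" biclique per vertex on the smaller side of the graph. This is an illustrative sketch only; the names `star_partition` and `is_biclique_partition`, and the small example graph, are assumptions and do not appear in the article. Vertices with no edges are skipped, so the sketch may return fewer than min(|M|, |W|) bicliques.

```python
def star_partition(M, W, C):
    """Return a biclique partition of B(M, W, C) with at most
    min(|M|, |W|) bicliques, one star per vertex on the smaller side."""
    if len(M) <= len(W):
        stars = [({m}, {w for w in W if (m, w) in C}) for m in M]
    else:
        stars = [({m for m in M if (m, w) in C}, {w}) for w in W]
    # Drop stars for isolated vertices, which contain no edges.
    return [(X, Y) for X, Y in stars if X and Y]

def is_biclique_partition(C, bicliques):
    """Check that the sets X x Y are pairwise disjoint and their union is C."""
    covered = [(m, w) for X, Y in bicliques for m in X for w in Y]
    return len(covered) == len(set(covered)) and set(covered) == set(C)

# A small hypothetical graph: two men, three women, three compatible pairs.
M = ["m1", "m2"]
W = ["w1", "w2", "w3"]
C = {("m1", "w2"), ("m1", "w3"), ("m2", "w1")}

partition = star_partition(M, W, C)
print(len(partition), is_biclique_partition(C, partition))
```

The bound is not always tight in general, which is why the exact value of the biclique partition number for the divorced-couples graph needs a separate argument.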

The staff mathematician can reformulate the general dating service problem in
terms of bipartite graphs and biclique partitions as follows. View $M$ as the set of
gentleman clients, $W$ as the set of lady clients, and $C$ as the set of compatible pairs
(i.e., $(m, w) \in C$ if and only if $m$ and $w$ are willing to attend the same party). The
graph illustrated in Figure 1 is the bipartite graph in the case of 5 divorced
couples. If a party is viewed as a subset of $M \cup W$, then a party with no conflicts
corresponds to a biclique of $B$. Thus, a sequence of conflict-free parties where
each compatible pair $(m, w)$ attends exactly one common party corresponds to a
biclique partition of $B$, and the fewest number of parties in such a sequence is the
biclique partition number of $B$. For example, the biclique partition in (1)
corresponds to the second sequence of parties described in Section 2.

Therefore, since every bipartite graph has a biclique partition, the staff mathe- 
matician knows that it is always possible to design a sequence of conflict-free 
parties, and that the fewest number of parties needed is the biclique partition 
number of the corresponding graph. To solve the problem of 5 divorced couples 
for the dating service, the staff mathematician needs to find the biclique partition 
number of the graph in Figure 1, and then classify the exact biclique partitions of 
this graph. 

The following result (see [1]) determines the biclique partition number of the 
graph corresponding to n divorced couples and places severe restrictions on its 
exact biclique partitions. 


Theorem 3.1. For $n$ an integer with $n \ge 2$, let $M = \{m_1, m_2, \ldots, m_n\}$,
$W = \{w_1, w_2, \ldots, w_n\}$, $C = \{(m_i, w_j) \mid 1 \le i, j \le n,\ i \ne j\}$, and
$B = B(M, W, C)$. Then

$$\mathrm{bp}(B) = n.$$

Furthermore, if $(X_1, Y_1), \ldots, (X_n, Y_n)$ is an exact biclique partition of $B$, then there
exist integers $r$ and $s$ such that

a) $rs = n - 1$, $|X_i| = r$, and $|Y_i| = s$ for each $i = 1, 2, \ldots, n$,

b) Each element of $M$ is in exactly $r$ of the $X_i$ and each element of $W$ is in exactly $s$
of the $Y_i$, and

c) For $i \ne j$, exactly one element of $X_i \times Y_j$ is not in $C$.


Theorem 3.1 shows that for $n$ divorced couples the dating service must throw at
least $n$ parties. Moreover, for any such sequence of $n$ parties there are integers $r$
and $s$ with $rs = n - 1$ such that each party has $r$ men and $s$ women, each
man is invited to exactly $r$ parties, each woman is invited to exactly $s$ parties, and
for any two parties there is exactly one man who is invited to the first party and
whose ex-wife is invited to the second party. The case that $n - 1$ is prime is
particularly interesting, since, if $rs = n - 1$, then either $r = n - 1$ or $s = n - 1$. It
follows that there are exactly two ways that the dating service can design a
sequence of $n$ parties. Either for each man there is a party where he is the only
invited man and all women (except his ex-wife) are invited, or for each woman
there is a party where she is the only invited woman and all men (except her
ex-husband) are invited.

In the next section, we follow [1] and prove Theorem 3.1 using elementary linear 
algebra. 


4. LINEAR ALGEBRA. The problem of determining the biclique partition number
of a bipartite graph can be rephrased as a matrix problem. Throughout this
section, let $M = \{m_1, \ldots, m_k\}$, $W = \{w_1, \ldots, w_l\}$, and let $B$ be a bipartite graph
$B(M, W, C)$. The reduced adjacency matrix of $B$ is the $k$-by-$l$ matrix $A = [a_{ij}]$ with
$a_{ij} = 1$ if $(m_i, w_j) \in C$, and $a_{ij} = 0$ otherwise. For example, the reduced adjacency
matrix of the graph in Figure 1 is

$$\begin{bmatrix}
0 & 1 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 & 1 \\
1 & 1 & 1 & 0 & 1 \\
1 & 1 & 1 & 1 & 0
\end{bmatrix}.$$

For any subset $R$ of $M$ we let

$$R = \begin{bmatrix} r_1 \\ \vdots \\ r_k \end{bmatrix}$$

where $r_i = 1$ if $m_i \in R$, and $r_i = 0$ otherwise. Similarly, for any subset $S$ of $W$ we
let

$$S = \begin{bmatrix} s_1 \\ \vdots \\ s_l \end{bmatrix}$$

where $s_j = 1$ if $w_j \in S$, and $s_j = 0$ otherwise. It is easy to verify that $RS^T$ is the
$k$-by-$l$ matrix with $(i, j)$ entry equal to 1 if $(m_i, w_j) \in R \times S$, and equal to 0
otherwise. Thus, $RS^T$ is the reduced adjacency matrix of the bipartite graph

$$B(M, W, \{(m_i, w_j) : m_i \in R \text{ and } w_j \in S\}).$$

Remark 4.0. Each entry of $RS^T$ is less than or equal to the corresponding entry of
$A$ if and only if $(R, S)$ is a biclique of $B$.


Lemma 4.1. Let $X_1, X_2, \ldots, X_p$ be subsets of $M$ and let $Y_1, Y_2, \ldots, Y_p$ be subsets of
$W$. Then $(X_1, Y_1), \ldots, (X_p, Y_p)$ is a biclique partition of $B$ if and only if
$A = X_1 Y_1^T + \cdots + X_p Y_p^T$.


Proof: First suppose that $(X_1, Y_1), \ldots, (X_p, Y_p)$ is a biclique partition of $B$. Let $i$
and $j$ be such that the $(i, j)$ entry of $A$ equals 1. Then there is an edge joining $m_i$
and $w_j$ in $B$, and this edge is contained in exactly one of the bicliques
$(X_1, Y_1), \ldots, (X_p, Y_p)$, say $(X_s, Y_s)$. Thus the $(i, j)$ entry of $X_s Y_s^T$ equals 1, and the
$(i, j)$ entry of $X_r Y_r^T$ equals 0 for $r \ne s$. It follows that the $(i, j)$ entry of
$X_1 Y_1^T + \cdots + X_p Y_p^T$ equals 1. Now let $i$ and $j$ be such that the $(i, j)$ entry of $A$
equals 0. Then $B$ does not contain the edge joining $m_i$ and $w_j$ and there is no $r$ such
that $m_i \in X_r$ and $w_j \in Y_r$. Thus the $(i, j)$ entry of $X_r Y_r^T$ equals 0 for
$r = 1, 2, \ldots, p$, and hence the $(i, j)$ entry of $X_1 Y_1^T + \cdots + X_p Y_p^T$ equals 0.
Therefore, $A = X_1 Y_1^T + \cdots + X_p Y_p^T$.

Conversely suppose that

$$A = X_1 Y_1^T + \cdots + X_p Y_p^T. \tag{3}$$

Since each $X_r Y_r^T$ is a matrix of 0’s and 1’s, the entries of each $X_r Y_r^T$ are less than
or equal to the corresponding entries of $A$. It follows from Remark 4.0 that each
$(X_r, Y_r)$ is a biclique of $B$. Consider any edge, say the edge joining $m_i$ and $w_j$, of $B$.
Since the $(i, j)$ entry of $A$ equals 1 and each $X_r Y_r^T$ is a matrix of 0’s and 1’s,
equation (3) implies that there is a unique $s$ such that the $(i, j)$ entry of $X_s Y_s^T$ is
nonzero. Hence, the edge joining $m_i$ and $w_j$ is contained in exactly one of the
bicliques $(X_1, Y_1), \ldots, (X_p, Y_p)$. Therefore, $(X_1, Y_1), \ldots, (X_p, Y_p)$ forms a biclique
partition of $B$. ∎


Remark 4.2. Let $X_1, \ldots, X_p$ be subsets of $M$ and let $Y_1, \ldots, Y_p$ be subsets of $W$.
Let $X$ be the $k$-by-$p$ matrix whose $i$th column is $X_i$ and let $Y$ be the $p$-by-$l$ matrix
whose $i$th row is $Y_i^T$. Then by block multiplication of matrices we see that

$$XY = X_1 Y_1^T + X_2 Y_2^T + \cdots + X_p Y_p^T.$$

Thus, a biclique partition of $B$ into $p$ bicliques gives rise to the factorization
$A = XY$, where $X$ is a $k$-by-$p$ (0,1)-matrix and $Y$ is a $p$-by-$l$ (0,1)-matrix.
Conversely, it is easy to verify that if $X$ is a $k$-by-$p$ (0,1)-matrix and $Y$ is a $p$-by-$l$
(0,1)-matrix with $A = XY$, then $(X_1, Y_1), (X_2, Y_2), \ldots, (X_p, Y_p)$ is a biclique
partition of $B$, where $X_i$ is the $i$th column of $X$ and $Y_i^T$ is the $i$th row of $Y$.

In particular, it follows that the biclique partition number of $B$ is the smallest $p$
such that $A = XY$, where $X$ is a $k$-by-$p$ (0,1)-matrix and $Y$ is a $p$-by-$l$ (0,1)-matrix.
Since $p \ge \operatorname{rank}(X) \ge \operatorname{rank}(XY)$, we have the lower bound

$$\mathrm{bp}(B) \ge \operatorname{rank}(A). \tag{4}$$

While it is easy to compute the rank of $A$, it seems quite difficult to compute
$\mathrm{bp}(B)$ for arbitrary bipartite graphs $B$ (see [6]).
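
The correspondence of Remark 4.2 is easy to try out. Below is a small sketch in plain Python (the helper names `biclique_factors` and `matmul` are ours) that turns a list of bicliques into the 0/1 matrices $X$ and $Y$ and checks that $XY$ reproduces the reduced adjacency matrix for the 5-couples example.

```python
def biclique_factors(bicliques, k, l):
    """Column t of X and row t of Y are the indicator vectors of biclique t."""
    X = [[int(i in Xt) for Xt, _ in bicliques] for i in range(1, k + 1)]
    Y = [[int(j in Yt) for j in range(1, l + 1)] for _, Yt in bicliques]
    return X, Y

def matmul(P, Q):
    """Plain integer matrix product of two list-of-rows matrices."""
    return [[sum(P[i][t] * Q[t][j] for t in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

n = 5
A = [[int(i != j) for j in range(n)] for i in range(n)]  # J_5 - I_5
bicliques = [({1, 2}, {3, 5}), ({2, 3}, {4, 1}), ({3, 4}, {5, 2}),
             ({4, 5}, {1, 3}), ({5, 1}, {2, 4})]
X, Y = biclique_factors(bicliques, n, n)
print(matmul(X, Y) == A)
```

The "one man per party" partition gives the trivial factorization $X = I_5$, $Y = A$, which the same helpers confirm.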

There are bipartite graphs $B$ for which $\mathrm{bp}(B) > \operatorname{rank}(A)$. For example, let $B$
be the bipartite graph whose reduced adjacency matrix is

$$A = \begin{bmatrix}
1 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 \\
1 & 0 & 0 & 1
\end{bmatrix}.$$

Clearly, $A$ has rank 3. It is easy to see that each biclique of $B$ has at most 2 edges.
Since $B$ has 8 edges, this implies that $\mathrm{bp}(B) \ge 4$. Therefore, the biclique partition
number of $B$ is greater than the rank of its reduced adjacency matrix.


For the remainder of this section we specialize to the $n$ divorced couples case
with $n \ge 2$. Thus, $M = \{m_1, m_2, \ldots, m_n\}$, $W = \{w_1, w_2, \ldots, w_n\}$ and
$C = \{(m_i, w_j) : i \ne j\}$. In this case $A = J_n - I_n$ where $J_n$ is the $n$-by-$n$ matrix of
all 1’s and $I_n$ is the $n$-by-$n$ identity matrix. It is easy to verify that
$((n - 1)^{-1} J_n - I_n) A = I_n$. Hence $A$ is invertible, so $\operatorname{rank}(A) = n$. The bounds (2)
and (4) permit us to conclude that

$$\mathrm{bp}(B) = n. \tag{5}$$

The following lemma is part of the folklore of linear algebra.

Lemma 4.3. Let $X$ be an $n$-by-$k$ matrix and let $Y$ be a $k$-by-$n$ matrix. Then

$$\det(I_n + XY) = \det(I_k + YX).$$

Proof: Check that

$$\begin{bmatrix} I_k & 0 \\ X & I_n + XY \end{bmatrix}
\begin{bmatrix} I_k & -Y \\ 0 & I_n \end{bmatrix}
=
\begin{bmatrix} I_k + YX & -Y \\ 0 & I_n \end{bmatrix}
\begin{bmatrix} I_k & 0 \\ X & I_n \end{bmatrix}$$

and take determinants of both sides. Recall that the determinant of a product of
square matrices is the product of the determinants of the factors, and that the
determinant of a block triangular matrix whose diagonal blocks are square is the
product of the determinants of the diagonal blocks. ∎


Lemma 4.4. Let $X$ and $Y$ be $n$-by-$n$ (0,1)-matrices with $XY = J_n - I_n$ and $n \ge 2$.
Then $XY = YX$, and there exist positive integers $r, s$ with $rs = n - 1$ such that
$XJ_n = J_n X = rJ_n$ and $YJ_n = J_n Y = sJ_n$.


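Proof: Let $X_i$ denote the $i$th column of $X$, and let $Y_i^T$ denote the $i$th row of $Y$.
Since

$$0 = \operatorname{tr}(XY) = \operatorname{tr}(YX) = \sum_{j=1}^{n} Y_j^T X_j$$

and every term of the sum is nonnegative, we conclude that

$$Y_j^T X_j = 0 \quad \text{for } j = 1, 2, \ldots, n. \tag{6}$$

Let $e$ be the $n$-by-1 vector of all 1’s. Then $J_n = ee^T$. For $i \ne j$, we have

$$I_n + X_i Y_i^T + X_j Y_j^T = ee^T - \sum_{l \ne i, j} X_l Y_l^T. \tag{7}$$

The right-hand side of equation (7) is the sum of $n - 1$ matrices each of rank at most 1.
Since the rank of a sum of matrices is at most the sum of the ranks of the matrices,
we conclude that the $n$-by-$n$ matrix $I_n + X_i Y_i^T + X_j Y_j^T$ has rank at most $n - 1$,
and hence is not invertible. Thus $0 = \det(I_n + X_i Y_i^T + X_j Y_j^T)$. Using this
observation, Lemma 4.3, and (6), we conclude that

$$\begin{aligned}
0 &= \det(I_n + X_i Y_i^T + X_j Y_j^T)
   = \det\left(I_n + \begin{bmatrix} X_i & X_j \end{bmatrix}
     \begin{bmatrix} Y_i^T \\ Y_j^T \end{bmatrix}\right) \\
  &= \det\left(I_2 + \begin{bmatrix} Y_i^T \\ Y_j^T \end{bmatrix}
     \begin{bmatrix} X_i & X_j \end{bmatrix}\right)
   = \det\left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
     + \begin{bmatrix} 0 & Y_i^T X_j \\ Y_j^T X_i & 0 \end{bmatrix}\right) \\
  &= 1 - (Y_i^T X_j)(Y_j^T X_i).
\end{aligned}$$

Thus $1 = (Y_i^T X_j)(Y_j^T X_i)$. Since the entries of $Y_i$ and $X_j$ are 0’s and 1’s, this implies
that

$$Y_i^T X_j = 1 \quad \text{for } i \ne j. \tag{8}$$

Since

$$YX = \begin{bmatrix} Y_1^T \\ \vdots \\ Y_n^T \end{bmatrix}
\begin{bmatrix} X_1 & \cdots & X_n \end{bmatrix} = [\,Y_i^T X_j\,],$$

(6) and (8) imply that $YX = J_n - I_n = XY$. Since $X$ commutes with $Y$, $X$, and $I_n$, it
commutes with $YX + I_n$, which is $J_n$. A matrix commutes with $J_n$ if and only if its
row sums and column sums are all equal, so $XJ_n = J_n X = rJ_n$ for some nonnegative
integer $r$. Similarly, $YJ_n = J_n Y = sJ_n$ for some nonnegative integer $s$. Because
$rsJ_n = (XY)J_n = (J_n - I_n)J_n = (n - 1)J_n$, we have $rs = n - 1$, and the proof is complete. ∎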


We now have all the necessary ingredients to prove our main result, Theorem
3.1. Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_p, Y_p)$ be a biclique partition of $B$. By (5), $p \ge n$.
Let $X$ be the $n$-by-$p$ matrix whose $i$th column is $X_i$, and let $Y$ be the $p$-by-$n$
matrix whose $i$th row is $Y_i^T$. From Remark 4.2, $XY = J_n - I_n$. Thus, if $p = n$,
Lemma 4.4 applies and statements a), b), and c) of Theorem 3.1 follow by
appropriately interpreting the entries of $XJ$, $YJ$, $JX$, $JY$, and $YX$.


5. EPILOGUE. We conclude with some further observations and a brief discussion
of related problems.

For any pair of positive integers $r$ and $s$ with $rs = n - 1$, there is a (0,1)-matrix
factorization $XY = J_n - I_n$ with $XJ_n = J_n X = rJ_n$ and $YJ_n = J_n Y = sJ_n$. Namely,
let $X = Z + Z^2 + \cdots + Z^r$ and $Y = I_n + Z^r + Z^{2r} + \cdots + Z^{(s-1)r}$, where $Z$ is the
$n$-by-$n$ matrix with 1’s in positions $(1, 2), (2, 3), \ldots, (n - 1, n)$, and $(n, 1)$.
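
The construction can be checked directly for small $n$. The sketch below (function names are ours) uses the fact that $Z^a$ has its 1's in the positions $(i, j)$ with $j - i \equiv a \pmod n$, builds $X$ and $Y$ accordingly, and verifies $XY = J_n - I_n$ together with the row and column sums.

```python
def cyclic_factorization(n, r, s):
    """X = Z + ... + Z^r and Y = I + Z^r + ... + Z^((s-1)r) for the cyclic shift Z."""
    assert r * s == n - 1
    x_shifts = set(range(1, r + 1))
    y_shifts = {(b * r) % n for b in range(s)}
    X = [[int((j - i) % n in x_shifts) for j in range(n)] for i in range(n)]
    Y = [[int((j - i) % n in y_shifts) for j in range(n)] for i in range(n)]
    return X, Y

def matmul(P, Q):
    return [[sum(P[i][t] * Q[t][j] for t in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def check(n, r, s):
    X, Y = cyclic_factorization(n, r, s)
    target = [[int(i != j) for j in range(n)] for i in range(n)]  # J_n - I_n
    sums_ok = (all(sum(row) == r for row in X) and all(sum(col) == r for col in zip(*X))
               and all(sum(row) == s for row in Y) and all(sum(col) == s for col in zip(*Y)))
    return matmul(X, Y) == target and sums_ok

print([check(7, r, 6 // r) for r in (1, 2, 3, 6)])
```

The check works because the exponents $a + br$ with $1 \le a \le r$ and $0 \le b \le s - 1$ run through $1, 2, \ldots, n - 1$ exactly once.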

If $XY = J_n - I_n$ is a (0,1)-matrix factorization then so is $\hat{X}\hat{Y} = J_n - I_n$ where
$\hat{X} = XP$, $\hat{Y} = P^T Y$, and $P$ is a permutation matrix. The conflict-free party scheme
corresponding to that of $\hat{X}\hat{Y} = J_n - I_n$ can be obtained from that corresponding to
$XY = J_n - I_n$ by permuting the order of the parties according to the permutation
$P$. Thus, one may consider these factorizations to be equivalent. As shown in [2],
there are matrix factorizations of $J_n - I_n$ that are not equivalent to any of those
constructed in the preceding paragraph. A complete characterization of the
nonequivalent factorizations of $J_n - I_n$ (equivalently, the exact biclique partitions
of the graph corresponding to $n$ divorced couples) is not yet known.

The study of biclique partitions arose first in the context of an addressing 
problem in computer science [4]. In this setting one needs to consider graphs other 
than bipartite graphs. 

A biclique B of a general graph G is a subgraph of G with the property that 
there exist disjoint sets X and Y of vertices of G such that the edges of B are 
precisely those edges joining a vertex in X and a vertex in Y. Just as in the case of 
bipartite graphs, the biclique partition number of G, bp(G), is the fewest number of 
bicliques of $G$ whose edges partition the edges of $G$. If the vertices of $G$ are
$1, 2, \ldots, n$, then the adjacency matrix of $G$ is the $n$-by-$n$ matrix $A$ whose $(i, j)$
entry equals 1 if vertex $i$ and vertex $j$ are joined by an edge of $G$, and 0 otherwise.
The following elegant result of Graham and Pollak [4], whose proof also uses
elementary linear algebra, relates the biclique partition number of a graph and the
eigenvalues of its adjacency matrix.

Theorem 5.1. Let $G$ be a graph with adjacency matrix $A$. Then

$$\mathrm{bp}(G) \ge \max\{n_+, n_-\},$$

where $n_+$ and $n_-$ denote the number of positive and negative eigenvalues of $A$,
respectively.

The adjacency matrix of the complete graph, $K_n$, is $J_n - I_n$, and the eigenvalues
of this matrix are $n - 1, -1, -1, \ldots, -1$. Hence by the Graham-Pollak theorem,
$\mathrm{bp}(K_n) \ge n - 1$. Since it is easy to construct a biclique partition of $K_n$ with $n - 1$
bicliques, $\mathrm{bp}(K_n) = n - 1$. There are numerous different proofs of this fact [7, 8].
Interestingly, all currently known proofs use elementary linear algebra.

Recently Alon, Saks, and Seymour (see section 9.12 of [5]) have proposed a
problem on biclique partitions. The chromatic number of a graph $G$ is the fewest
number of colors needed to color the vertices of $G$ in such a way that every pair of
adjacent vertices have different colors. Clearly, the chromatic number of the
complete graph $K_n$ on $n$ vertices is $n$. Thus, for the complete graph, its biclique
partition number is strictly less than its chromatic number. They ask if this holds in
general. Namely, is it true that if $G$ is a graph with chromatic number $k$, then
bp(G) < k?

An Infinite Set of Heron Triangles
with Two Rational Medians


Ralph H. Buchholz and Randall L. Rathbun 


I. INTRODUCTION. If we denote the sides of a triangle by $(a, b, c)$ then the area
is given by

$$A = \sqrt{s(s - a)(s - b)(s - c)} \tag{1}$$

where $s = (a + b + c)/2$ is the semiperimeter. This formula is usually attributed to
Heron of Alexandria, circa 100 BC-100 AD. However, it was already known to
Archimedes prior to 212 BC [5, p. 105].

Our investigation is limited to triangles with rational sides. Even with sides of 
rational length, “Heron’s” formula shows that the area need not be rational; any 
triangle with three rational sides and rational area is called a Heron triangle. The 
smallest such triangle with integer sides is the familiar (5, 4,3) right triangle (with 
area 6) shown in Figure 1. 


Figure 1. The (5,4,3) right triangle.


If we let $(k, l, m)$ denote the medians that are incident with the respective sides
$(a, b, c)$, they can be expressed in terms of the sides:

$$k = \tfrac{1}{2}\sqrt{2b^2 + 2c^2 - a^2}, \quad
l = \tfrac{1}{2}\sqrt{2c^2 + 2a^2 - b^2}, \quad
m = \tfrac{1}{2}\sqrt{2a^2 + 2b^2 - c^2}. \tag{2}$$

The medians of the (5, 4, 3) triangle are $(k, l, m) = (5/2, \sqrt{13}, \sqrt{73}/2)$. This
triangle has rational area and one rational median, from the midpoint of the
hypotenuse to the vertex at the right angle. It is an interesting exercise to prove
that integer right triangles have precisely one rational median [1, p. 31], namely the
median to the hypotenuse.
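
Formulas (1) and (2) lend themselves to exact rational arithmetic. The following is a minimal sketch (helper names are ours) that returns the area and the three medians as exact fractions, or `None` for a value that is irrational.

```python
from fractions import Fraction
from math import isqrt

def rational_sqrt(q):
    """Exact square root of a Fraction, or None if it is negative or irrational."""
    q = Fraction(q)
    if q < 0:
        return None
    num, den = isqrt(q.numerator), isqrt(q.denominator)
    if num * num == q.numerator and den * den == q.denominator:
        return Fraction(num, den)
    return None

def area(a, b, c):
    """Heron's formula (1); None when the area is not rational."""
    s = Fraction(a + b + c, 2)
    return rational_sqrt(s * (s - a) * (s - b) * (s - c))

def medians(a, b, c):
    """Medians (k, l, m) from formula (2); irrational entries are None."""
    squares = (2*b*b + 2*c*c - a*a, 2*c*c + 2*a*a - b*b, 2*a*a + 2*b*b - c*c)
    roots = [rational_sqrt(sq) for sq in squares]
    return tuple(None if r is None else r / 2 for r in roots)

print(area(5, 4, 3), medians(5, 4, 3))
print(area(73, 51, 26), medians(73, 51, 26))
```

For the (5, 4, 3) triangle only the median to the hypotenuse comes back as a fraction, while (73, 51, 26) has rational area and two rational medians.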

But can any Heron triangle have two rational medians? In 1905, Schubert [3, p.
199] claimed that no such triangle could exist. As Dickson points out [3, p. 208],
Schubert’s proof was flawed but no such triangle was forthcoming. Despite this
flaw, the parametrization used by Schubert turns out to be extremely useful in
helping to uncover a key underlying pattern.

II. THE SCHUBERT PARAMETERS. Consider the triangle in Figure 2, showing
one of the medians with its adjacent angles. If we apply the trigonometric identity

$$\cot\left(\frac{\alpha}{2}\right) = \frac{\sin\alpha}{1 - \cos\alpha}$$

to the angle $\alpha_a$, say, in Figure 2, then it is clear that the corresponding half-angle
cotangent is rational if and only if $\sin\alpha_a$ and $\cos\alpha_a$ are rational.

Figure 2. The angles related to Schubert’s parameters.

Since

$$\sin\alpha_a = \frac{A}{bk} \quad \text{and} \quad
\cos\alpha_a = \frac{b^2 + k^2 - (a/2)^2}{2bk},$$

we see that $\sin\alpha_a$, $\cos\alpha_a$ and hence $\cot(\alpha_a/2)$ are rational for any Heron triangle
with a rational median $k$. The same argument applies to all the angles $\alpha_a, \beta_a, \gamma_a, \delta_a$
adjacent to median $k$ so all the half-angle cotangents are rational in this case. To
ensure an unambiguous naming scheme for these parameters we impose a
counter-clockwise orientation on the triangle around its centroid. Then the angles
that the median to side $a$ makes with the triangle, beginning with the two at the
vertex, are labeled $\alpha_a, \beta_a, \gamma_a, \delta_a$ as in Figure 2. The respective half-angle
cotangents are denoted by $M_a, P_a, X_a, Y_a$; see Table 1. We call the set of rational
numbers $(M, P, X, Y)$ Schubert parameters; it is understood that if no subscript is
present then the parameters are all obtained from the same median. For the
(5, 4, 3) Heron triangle, we obtain $(M_a, P_a, X_a, Y_a) = (3, 2, 4/3, 3/4)$.

TABLE 1. SCHUBERT PARAMETERS FOR A TRIANGLE WITH SIDES (a, b, c)

$$M_a = \frac{4A}{4bk + a^2 - 3b^2 - c^2}, \qquad
P_a = \frac{4A}{4ck + a^2 - b^2 - 3c^2},$$

$$X_a = \frac{4A}{2ak - b^2 + c^2}, \qquad
Y_a = \frac{4A}{2ak + b^2 - c^2}.$$

The half-angle cotangents $X$ and $Y$ satisfy $XY = 1$, while the three half-angle
cotangents $M$, $P$, and $X$ satisfy an important relationship first proved by Schubert:

$$\left(M - \frac{1}{M}\right) - \left(P - \frac{1}{P}\right) = 2\left(X - \frac{1}{X}\right). \tag{3}$$

Although only two parameters suffice to describe any triangle, we usually consider
three parameters $(M, P, X)$. It is important to note that if $(M, P, X)$ does satisfy
equation (3), then so do 32 related 3-tuples. These occur because equation (3) is
invariant under the following operations:

(i) replace any parameter by its negated inverse, or
(ii) interchange $M$ and $P$ while also inverting $X$, or
(iii) simultaneously invert all three of the parameters.

Since all such 3-tuples correspond to the same Heron triangle, we occasionally use
an alternate representation.

Conversely, if we know any set of Schubert parameters, $(M, P, X)$ say, then we
can calculate the ratio of the sides $(a, b, c)$ from

$$\frac{a}{c} = \frac{2(X + X^{-1})}{P + P^{-1}}, \qquad
\frac{b}{c} = \frac{M + M^{-1}}{P + P^{-1}}. \tag{4}$$

This specifies the triangle up to homothety (a similarity transformation), which is
sufficient for our purposes.
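
As a sanity check, the relations of this section can be evaluated exactly for the (5, 4, 3) triangle. The sketch below assumes the half-angle cotangent formulas $M = 4A/(4bk + a^2 - 3b^2 - c^2)$, $P = 4A/(4ck + a^2 - b^2 - 3c^2)$, $X = 4A/(2ak - b^2 + c^2)$, $Y = 4A/(2ak + b^2 - c^2)$ for the median $k$ to side $a$ (our reading of Table 1), and the function name is ours.

```python
from fractions import Fraction

def schubert_parameters(a, b, c, k, area):
    """(M, P, X, Y) for the median k to side a, under the formulas assumed above."""
    four_A = 4 * Fraction(area)
    k = Fraction(k)
    M = four_A / (4*b*k + a*a - 3*b*b - c*c)
    P = four_A / (4*c*k + a*a - b*b - 3*c*c)
    X = four_A / (2*a*k - b*b + c*c)
    Y = four_A / (2*a*k + b*b - c*c)
    return M, P, X, Y

M, P, X, Y = schubert_parameters(5, 4, 3, Fraction(5, 2), 6)
print(M, P, X, Y)
print(X * Y == 1)                                    # XY = 1
print((M - 1/M) - (P - 1/P) == 2 * (X - 1/X))        # Schubert's relation (3)
print(2 * (X + 1/X) / (P + 1/P), (M + 1/M) / (P + 1/P))  # a/c and b/c from (4)
```

The last line recovers the side ratios $a/c$ and $b/c$ of the triangle the parameters came from, which is the sense in which the parameters determine the triangle up to scaling.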

In the process of trying to describe all rational-sided triangles with three
rational medians the first author discovered that any rational-sided triangle,
$(a, b, c)$, with two rational medians is given by the parametrization (see [1, p. 38])

$$\begin{aligned}
a &= \tau\{(-2\phi\theta^2 - \phi^2\theta) + (2\theta\phi - \phi^2) + \theta + 1\} \\
b &= \tau\{(\phi\theta^2 + 2\phi^2\theta) + (2\theta\phi - \theta^2) - \phi + 1\} \\
c &= \tau\{(\phi\theta^2 - \phi^2\theta) + (\theta^2 + 2\theta\phi + \phi^2) + \theta - \phi\}
\end{aligned} \tag{5}$$

for $(\tau, \phi, \theta)$ constrained such that $\tau > 0$, $0 < \theta, \phi < 1$, and $\phi + 2\theta > 1$. In this
case, if the parameters $(\tau, \theta, \phi)$ are rational, then the corresponding triangle must
have rational sides and two rational medians, namely $k$ and $l$, but not necessarily
rational area. The scaling factor $\tau$ is usually set to one. Solving for $\theta$ and $\phi$ gives

$$\theta = \frac{c - a \pm \sqrt{2c^2 + 2a^2 - b^2}}{a + b + c} \quad \text{and} \quad
\phi = \frac{b - c \pm \sqrt{2b^2 + 2c^2 - a^2}}{a + b + c}. \tag{6}$$

Any triangle obtained from a rational triple $(M, P, X)$ has rational sides,
rational area, and one rational median, while a triangle obtained from a rational
pair $(\theta, \phi)$ has rational sides and two rational medians. It is the unveiling of the
interplay of these two parametrizations of a triangle that ultimately allows us to
make progress on the question mentioned in the introduction.
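
A quick round trip illustrates how (5) and (6) fit together. In this sketch (helper names are ours, and the $+$ sign is taken in (6)) the pair $\theta = \phi = 1/2$ is mapped to a triangle, both medians $k$ and $l$ are confirmed to be rational, and the parameters are then recovered from the sides.

```python
from fractions import Fraction
from math import isqrt

def exact_sqrt(q):
    """Square root of a Fraction if it is rational, else None."""
    q = Fraction(q)
    if q < 0:
        return None
    n, d = isqrt(q.numerator), isqrt(q.denominator)
    return Fraction(n, d) if n * n == q.numerator and d * d == q.denominator else None

def sides(theta, phi, tau=1):
    """Sides (a, b, c) from parametrization (5)."""
    t, p = Fraction(theta), Fraction(phi)
    a = tau * ((-2*p*t**2 - p**2*t) + (2*t*p - p**2) + t + 1)
    b = tau * ((p*t**2 + 2*p**2*t) + (2*t*p - t**2) - p + 1)
    c = tau * ((p*t**2 - p**2*t) + (t**2 + 2*t*p + p**2) + t - p)
    return a, b, c

a, b, c = sides(Fraction(1, 2), Fraction(1, 2))
two_k = exact_sqrt(2*b*b + 2*c*c - a*a)   # twice the median to side a
two_l = exact_sqrt(2*c*c + 2*a*a - b*b)   # twice the median to side b
theta = (c - a + two_l) / (a + b + c)     # equation (6), + sign
phi = (b - c + two_k) / (a + b + c)
print((a, b, c), (two_k / 2, two_l / 2), (theta, phi))
```

Scaled by 8 this is the triangle (11, 9, 8) with medians 13/2 and 17/2; its area is irrational, which is why a separate area test is still needed.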


III. SEARCH RESULTS AND HINT OF A CONNECTION. In 1986, both
authors, unaware of each other’s work, began searching for Heron triangles with
two rational medians. One particularly efficient method is to enumerate over the
rational parameters $(\theta, \phi)$ in equations (5) and then check if the area of the
corresponding triangle is rational. This technique allowed us to obtain the last two
triangles in Table 2; meanwhile, naive exhaustion struggled to reach the fourth
triangle in the list.
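
The enumeration idea can be sketched in a few lines. This is only an illustration under our own choices (a small denominator bound, exact fractions, and the helper names below), not the authors' search program: it runs over rational $(\theta, \phi)$ in the admissible region, builds the sides from (5), and keeps the triangles whose area is rational.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, isqrt

def is_rational_square(q):
    if q < 0:
        return False
    n, d = isqrt(q.numerator), isqrt(q.denominator)
    return n * n == q.numerator and d * d == q.denominator

def sides(t, p):
    """Sides (a, b, c) from parametrization (5) with tau = 1."""
    a = (-2*p*t**2 - p**2*t) + (2*t*p - p**2) + t + 1
    b = (p*t**2 + 2*p**2*t) + (2*t*p - t**2) - p + 1
    c = (p*t**2 - p**2*t) + (t**2 + 2*t*p + p**2) + t - p
    return a, b, c

def primitive(a, b, c):
    """Scale rational sides to coprime integers, keeping the order (a, b, c)."""
    scale = reduce(lambda x, y: x * y // gcd(x, y), (v.denominator for v in (a, b, c)))
    ints = [int(v * scale) for v in (a, b, c)]
    g = reduce(gcd, ints)
    return tuple(v // g for v in ints)

def search(max_den):
    params = sorted({Fraction(p, q) for q in range(2, max_den + 1) for p in range(1, q)})
    found = set()
    for theta in params:
        for phi in params:
            if phi + 2 * theta <= 1:
                continue
            a, b, c = sides(theta, phi)
            if min(a, b, c) <= 0:
                continue
            sixteen_area_sq = (a + b + c) * (-a + b + c) * (a - b + c) * (a + b - c)
            if sixteen_area_sq > 0 and is_rational_square(sixteen_area_sq):
                found.add(primitive(a, b, c))
    return found

print(sorted(search(5)))
```

With denominators up to 5 the pair $(\theta, \phi) = (1/3, 2/5)$ is reached, and it yields the triangle (73, 51, 26).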

So Heron triangles with two rational medians do exist. Naturally we wondered
how to find, or better yet generate, more such triangles. The first author noted that
the first, second, fifth, and sixth triangles of Table 2 have related internal angles
and asked how this could be exploited.

TABLE 2. SIDES, MEDIANS, AREA OF DISCOVERED HERON TRIANGLES

            Sides                        Medians
        a         b         c           k           l           Area
       73        51        26        35/2        97/2            420
      626       875       291         572       433/2          55440
     4368      1241      3673        1657      7975/2        2042040
    14791     14384     11257     21177/2       11001       75698280
    28779     13816     15155      3589/2       21937       23931600
  1823675    185629   1930456   2048523/2   3751059/2   142334216640


IV. DISCOVERY OF THE SEQUENCE OF SQUARES. In October 1989, the
second author discovered a remarkable connection between the $X_a$ and $X_b$
parameters of related triangles. By selecting the “appropriate” Schubert parameters
and inverting where necessary (denoted by an asterisk), it became possible to
arrange the four triangles into a logical chain such that the $M_b$ parameter from
one triangle was equal to the $P_a$ parameter of the next triangle. We label these
first four triangles of the chain (see Table 3) by level 1, 2, 3 and 4 respectively, and
insert the degenerate triangle (2, 1, 1), with rational area and medians, at level 0 to
initialise the chain.


TABLE 3. TRIANGLES WITH A COMMON $\{M_b(i), P_a(i + 1)\}$ RATIO.


Level i Triangle M,() Pi) X (i) M,(i) P,(i) 


2 
3 
2 
3 
35 
6 


3080 
111 
3256 _* 36480 ** 
165585 70301 


The crucial observation occurred by comparing the $X_b(i)$ and $X_a(i + 1)$ ratios
of consecutive triangles. From levels 1 and 2 we observed that
$(40 \cdot 7)/(63 \cdot 10) = (2/3)^2$. Similarly, levels 2, 3 and 3, 4 imply that
$(99 \cdot 32)/(800 \cdot 539) = (3/35)^2$ and $(147 \cdot 1850)/(363 \cdot 4736) = (35/88)^2$. In other
words, there is a distinct pattern of rational squares in the first few products of the
numerators and denominators of the $X_b(i)$ and $X_a(i + 1)$ parameters. Furthermore,
the denominator of one square becomes the numerator of the next square. Now all
one needs to specify the next triangle in the chain is the denominator of the $X$
product ratio since this would determine $P(i + 1)$, $X(i + 1)$ and hence $M(i + 1)$ via
Schubert’s equation. For example, we set $P_a(5) = M_b(4)$. Then since


$$\frac{36480 \cdot 70301}{\operatorname{numerator}(X_a(5)) \cdot \operatorname{denominator}(X_a(5))} = \left(\frac{88}{k}\right)^2$$

and since $P_a(5)$ and $X_a(5)$ must lead to a rational value of $M_a(5)$ in Schubert’s
equation (3), one finds that $k = 37$ and hence $X_a(5)$. Now calculate the
Schubert parameters corresponding to the other rational median in this triangle


and repeat the process. This leads to the sequence of ratios

$$\left(\frac{1}{2}\right)^2, \left(\frac{2}{3}\right)^2, \left(\frac{3}{35}\right)^2,
\left(\frac{35}{88}\right)^2, \left(\frac{88}{37}\right)^2, \left(\frac{37}{4731}\right)^2, \ldots$$

This permitted us to generate the next few triangles. For example, the fifth
Heron-2-median triangle has sides given by (2442655864, 2396426547, 46263061).


V. CONNECTION TO SOMOS SEQUENCES. There the matter stood for 5 
years, until the two authors were able to re-establish contact. The main question 
was: How was the rational square sequence determined, and could a formula be 
found for it? After intense correspondence from late 1994 to early 1995, we 
obtained some interesting results. 
The problem with the method described in the previous section is that it
requires the factorization of numbers that are growing very rapidly. Furthermore,
there is still some ambiguity about inverting certain parameters and not others.

We found that all of the $(M, P, X)$ parameters could be formed as a combination
of two series. Notice that the numerator of the $X$ parameter in Table 4 is the
product $a_1 \cdot a_2 \cdot a_3 \cdot a_4$ and the denominator is likewise the product
$b_1 \cdot b_2 \cdot b_3 \cdot b_4$,


TABLE 4. DECOMPOSITION OF THE PARAMETER X,(i) 


Numerator Factors Denominator Factors Parameter 


X, (i) 


40 
32 
99 
147 
1850 
36480 
70301 
4301 
6001696 
109393830 
8383913 


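where each of the $a_i$ and $b_i$ is a shift of one or another of two special sequences.
There are similar relationships for all the Schubert parameters for our set of
triangles in terms of these two series, which we denote by $S$ and $T$.

TABLE 5. THE S AND T SERIES

  i      1    2    3    4    5    6    7    8    9    10
  S_i    1    1    2    3    5   11   37   83  274  1217
  T_i    1   -1    1    1   -7    8   -1  -57  391  -455

We observed that each series (see Table 5) seemed to satisfy an order eight
recurrence, namely,

$$S_i = \frac{2^{\chi(i+1)} \cdot 3^{\chi(i)} \cdot S_{i-1} \cdot S_{i-7} + S_{i-4}^2}{S_{i-8}}
\quad \text{and} \quad
T_i = \frac{6^{\chi(i)} \cdot T_{i-1} \cdot T_{i-7} + T_{i-4}^2}{T_{i-8}},$$

where

$$\chi(i) = \begin{cases} 0 & \text{if } i \text{ is even,} \\ 1 & \text{if } i \text{ is odd.} \end{cases}$$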

Since these two series were so fundamental, one author sent a query to the
On-Line Encyclopedia of Integer Sequences (sequences@research.att.com),
authored by Neil J. A. Sloane. It quickly posted back that the first, the $S$ series, was
indeed a Somos 5 sequence [4, p. 41], and gave the recursion formula

$$A_i = \frac{A_{i-1} \cdot A_{i-4} + A_{i-2} \cdot A_{i-3}}{A_{i-5}}. \tag{7}$$

We realised that the $T$ series satisfied the same recurrence with different initial
terms. In terms of the order 5 recurrence we have

$$S_i = \begin{cases} 1, 1, 2, 3, 5 & \text{for } i = 1, \ldots, 5 \\ A_i & \text{for } i \ge 6, \end{cases}
\qquad
T_i = \begin{cases} 1, -1, 1, 1, -7 & \text{for } i = 1, \ldots, 5 \\ A_i & \text{for } i \ge 6. \end{cases}
\tag{8}$$
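
The order 5 recurrence is straightforward to run. Here is a minimal sketch (the function name `somos5` is ours) that extends both sets of initial terms and checks along the way that every division is exact.

```python
def somos5(initial, count):
    """Extend five initial terms by A_i = (A_{i-1} A_{i-4} + A_{i-2} A_{i-3}) / A_{i-5}."""
    seq = list(initial)
    while len(seq) < count:
        numerator = seq[-1] * seq[-4] + seq[-2] * seq[-3]
        quotient, remainder = divmod(numerator, seq[-5])
        assert remainder == 0, "the division is expected to be exact"
        seq.append(quotient)
    return seq[:count]

S = somos5([1, 1, 2, 3, 5], 10)
T = somos5([1, -1, 1, 1, -7], 10)
print(S)
print(T)
```

The first ten terms agree with the values of the $S$ and $T$ series quoted in the text (11, 37, 83, 274, 1217 and 8, -1, -57, 391, -455 for $i = 6, \ldots, 10$).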
The half-angle cotangents of our chain of Heron triangles with two rational
medians are given in terms of the series S and T by


M,(i) _ Sina Six T; M,(i) _ Siat Sita’ Tia. Ting 
° Si T3411 ° Ti. ° Si+2 " 5543 " Tj 42 " Ti 43 
p (i) __ Si41 Si42 Tia! T3452 Pp (i) __ Si. S543 Lisa (9) 
° S; ; 5543 ; T; " 7343 ° Si44 ; T/.5 ; 7343 
ni, 9° Si 9° T; » Siar TA. °T, 
X,(i) _ W-1 ), +2 +3 X,(i) _ WN). +1 +2 +4 


Si43° Tj: T;*.9 


Equations (9) permitted us to rapidly compute many corresponding triangles using
multiprecision packages (MAPLE and PARI) and each such triangle invariably had
rational area and two rational medians.


2, . ; 
Sita Sita’ Tid 


VI. SEARCHING FOR A CLOSED FORM FOR S AND T SEQUENCES. Having
obtained recurrence relations for $S_i$ and $T_i$, we hoped that a closed formula would
allow us to prove some of the results that we had so far observed only numerically.
A second posting to the sci.math.research newsgroup prompted a number
of interesting responses but by far the most impressive came from Noam Elkies,
who gave two closed formulae for the $S_i$ sequence and indirectly provided a
formula for the $T_i$ sequence. What follows borrows heavily from his reply.

Numerical evidence suggests that the sequence $S_i$ also satisfies recurrence
relations of the form

$$\begin{aligned}
S_{i-2} S_{i+2} &= 2 S_{i-1} S_{i+1} - S_i^2 \quad \text{if } i \text{ is even,} \\
S_{i-2} S_{i+2} &= 3 S_{i-1} S_{i+1} - S_i^2 \quad \text{if } i \text{ is odd.}
\end{aligned}$$

It is possible to combine these into a single identity by defining

$$\sigma_i = \begin{cases} S_i & \text{if } i \text{ is even,} \\ r S_i & \text{if } i \text{ is odd.} \end{cases}$$

Replacing $S_i$ with $\sigma_i$ or $\sigma_i / r$ as appropriate and then equating the preceding two
recurrences, one finds that $r = \sqrt[4]{2/3}$. Hence, the $\sigma_i$ satisfy the recurrence
relation

$$\sigma_{i-2} \sigma_{i+2} = \sqrt{6}\, \sigma_{i-1} \sigma_{i+1} - \sigma_i^2.$$

Because of the similarity of this to a Somos recurrence on sequences of elliptic
theta functions, one attempts to fit a solution of the form

$$\sigma_i = b\, u^{i^2} \sum_{n = -\infty}^{+\infty} q^{n^2} z^{in}. \tag{10}$$

In fact, the parameters $q, z, b, u$ can be obtained numerically from the condition
that the formula for $\sigma_i$ hold for the initial values. This leads to

    q = 0.02208942811097933557356088...
    z = 0.1141942041600238048921321...
    b = 0.9576898995913810138013844...
    u = 0.7889128685374661530379575...

The theta function (10) is rapidly convergent and so we have a numerical, closed
form expression to evaluate each $\sigma_i$ and hence each $S_i$. Using the initial
conditions for the $T$-sequence would lead to a similar theta function.
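
To see the theta series at work, one can evaluate it in ordinary floating point. The sketch below assumes the reading $\sigma_i = b\,u^{i^2} \sum_n q^{n^2} z^{in}$ with $r = (2/3)^{1/4}$ and uses the four constants as quoted (truncated to double precision), so the results are only approximate; the function names and the truncation at 12 terms are our choices.

```python
Q = 0.02208942811097933557356088
Z = 0.1141942041600238048921321
B = 0.9576898995913810138013844
U = 0.7889128685374661530379575
R = (2 / 3) ** 0.25  # sigma_i = R * S_i for odd i

def sigma(i, terms=12):
    """Truncated theta series b * u^(i^2) * sum over n of q^(n^2) * z^(i*n)."""
    total = sum(Q ** (n * n) * Z ** (i * n) for n in range(-terms, terms + 1))
    return B * U ** (i * i) * total

def somos_from_theta(i):
    """Approximate S_i recovered from sigma_i."""
    value = sigma(i)
    return value if i % 2 == 0 else value / R

for i in range(1, 7):
    print(i, somos_from_theta(i))
```

The printed values should lie close to the integers $S_1, S_2, \ldots$; only a handful of terms of the sum contribute noticeably because $q^{n^2}$ decays so quickly.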
However, the numbers $S_i$ can also be obtained “arithmetically” from the elliptic
curve $\mathbb{C}^*/q^{2\mathbb{Z}}$ associated to our theta functions. By

(i) computing the $j$-invariant $j(E) = j(q^2)$ as a real number,
(ii) using its continued fraction to recognize $j(E)$ as the rational $11^6/612$,
(iii) computing the $x$-coordinate of the point $z$ on the curve $\mathbb{C}^*/q^{2\mathbb{Z}}$, which
determines the correct quadratic twist, and
(iv) reducing to standard minimal form,

Elkies finds the elliptic curve

$$E: y^2 + xy = x^3 + x^2 - 2x,$$

which is curve #102-A1 in Cremona’s tables [2]. It has a point of order 2 at (0, 0)
and an infinite order point at $P = (x, y) = (2, 2)$. For $i = 1, 2, 3, 4, \ldots$ the
$x$-coordinate of the $i$-th multiple of $P$ on $E$ in lowest terms is

$$\frac{2 \cdot 1^2}{1^2},\ \frac{1^2}{1^2},\ \frac{2 \cdot 2^2}{1^2},\ \frac{3^2}{1^2},\
\frac{2 \cdot 5^2}{7^2},\ \frac{11^2}{8^2},\ \frac{2 \cdot 37^2}{1^2},\ \ldots$$

Indeed, the numerator of the $x$-coordinate of $i * P$ is always $S_i^2$ or $2S_i^2$ according as
$i$ is even or odd. Notice that the denominator is precisely $T_i^2$. The two sequences are
very closely connected. Not only do they satisfy the same recurrence relation, but the
initial conditions are no longer arbitrary; given one it is possible to construct the other.

Unfortunately, we were not able to use either of these closed forms to prove
that the triangles generated from equations (9) and (4) always have rational area.
However, the elliptic curve does turn up again and leads to such a proof from a
different direction.
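
The multiples of $P$ can be computed exactly with the standard chord-and-tangent formulas for a curve $y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$. The following sketch (helper names are ours; it deliberately omits the point at infinity, which is not reached when adding $P$ to its own multiples) lists the first seven $x$-coordinates for $E$.

```python
from fractions import Fraction

# Coefficients of E: y^2 + xy = x^3 + x^2 - 2x
A1, A2, A3, A4, A6 = 1, 1, 0, -2, 0

def add(P1, P2):
    """Sum of two affine points on E; assumes P1 != -P2."""
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and y1 == y2:
        lam = (3*x1*x1 + 2*A2*x1 + A4 - A1*y1) / (2*y1 + A1*x1 + A3)  # tangent slope
    else:
        lam = (y2 - y1) / (x2 - x1)                                   # chord slope
    x3 = lam*lam + A1*lam - A2 - x1 - x2
    y3 = lam*(x1 - x3) - y1 - A1*x3 - A3
    return x3, y3

def multiples(P, count):
    """[P, 2P, ..., count*P]."""
    out, Q = [P], P
    for _ in range(count - 1):
        Q = add(Q, P)
        out.append(Q)
    return out

P = (Fraction(2), Fraction(2))
xs = [pt[0] for pt in multiples(P, 7)]
print(xs)
```

Comparing each $x$-coordinate with $S_i^2/T_i^2$ (times 2 for odd $i$) reproduces the list in the text for $i = 1, \ldots, 7$.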


VII. TRIANGLES IN THE θφ-PLANE LEAD TO FIVE ELLIPTIC CURVES. At
this stage we used equations (9), (4), and (6) to generate the values of $\theta$ and $\phi$
corresponding to the first 100 terms of the two Somos sequences $S_i$ and $T_i$. We
plotted these parameters, considered as points corresponding to distinct Heron
triangles with two rational medians, in the θφ-plane (Figure 3) and the structure
here was a surprise.

Rather than being randomly distributed in the region, the points seem to lie on
five distinct curves. During this process we discovered that the points were being
distributed to the five curves in a periodic way with a cycle length of 7. The points
generated by the parameter set $(M_a(i), P_a(i), X_a(i))$ visited the curves in the order
{1, 2, 3, 4, 1, 2, 5}. Similarly, the points generated by the set $(M_b(i), P_b(i), X_b(i))$
visited the curves in the order {2, 1, 4, 3, 2, 1, 5}. As a result, it was easy to isolate
the rational coordinates of enough points on each curve to determine the
corresponding equations:

$$\begin{aligned}
C_1:\ & 27\theta^3\phi^3 - \theta\phi(\theta - \phi)(8\theta^2 + 11\theta\phi + 8\phi^2) - 3\theta\phi(5\theta^2 - \theta\phi + 5\phi^2) \\
      & - (\theta - \phi)(\theta^2 + 4\theta\phi + \phi^2) - (3\theta^2 - 7\theta\phi + 3\phi^2) - 3(\theta - \phi) - 1 = 0, \\
C_2:\ & 3\theta^2\phi^2 - 2\theta\phi(\theta - \phi) - (\theta^2 + 6\theta\phi + \phi^2) + 1 = 0, \\
C_3:\ & \theta\phi(\theta - \phi)^3 - (\theta^4 + 11\theta^3\phi + 3\theta^2\phi^2 + 11\theta\phi^3 + \phi^4) \\
      & - 2(\theta^3 - \phi^3) + 10\theta\phi + 2(\theta - \phi) + 1 = 0, \\
C_4:\ & \theta\phi(\theta - \phi) + \theta\phi + 2(\theta - \phi) - 1 = 0, \\
C_5:\ & (\theta - 1)^3\phi^2 + 2(\theta + 1)(\theta^3 + 2\theta^2 - 2\theta + 1)\phi + (2\theta - 1)(\theta + 1)^3 = 0.
\end{aligned}$$

Figure 3. Heron triangles with 2 rational medians in the θφ-plane.


We conjectured that all the rational points on these five curves produce triangles
with rational area. Since the triangle has two rational medians, one can form $(\theta, \phi)$
parameters for either median. We call these dual parameter sets for the triangle.
The transformation that takes $(\theta, \phi)$ to its dual point $(\theta', \phi')$ is given by

$$\theta' = \frac{2\theta^2 + \theta\phi + \theta + \phi - 1}{3\theta\phi + \theta - \phi + 1}, \qquad
\phi' = \frac{-\theta\phi - 2\phi^2 + \theta + \phi + 1}{3\theta\phi + \theta - \phi + 1}.$$

Under this mapping the curves $C_1$ and $C_2$ are dual, as are $C_3$ and $C_4$, while $C_5$ is
self-dual. Thus it is sufficient to prove that all rational points on the curves $C_2$, $C_4$,
and $C_5$, say, correspond to Heron triangles with two rational medians.

Next, we find that $C_2$, $C_4$ and $C_5$ are all birationally equivalent to the same
elliptic curve so we need to prove the conjecture only for $C_4$, say. These three
curves are quadratic in $\phi$ and the respective discriminants are

$$\begin{aligned}
\operatorname{Disc}(C_2) &= 4(4\theta^4 + 8\theta^3 + 5\theta^2 - 2\theta + 1), \\
\operatorname{Disc}(C_4) &= \theta^4 + 2\theta^3 + 5\theta^2 - 8\theta + 4, \quad \text{and} \\
\operatorname{Disc}(C_5) &= 4\theta^2(\theta + 1)^2(\theta^4 + 2\theta^3 + 5\theta^2 - 8\theta + 4).
\end{aligned}$$

Since we are searching for rational points on each of the curves, we require the
discriminant of each to be a rational square. All the rational points that force this
correspond to rational points on the elliptic curve

$$Y^2 = X^4 + 2X^3 + 5X^2 - 8X + 4.$$

For $C_2$, we map $X$ to $-1/\theta$ while for $C_4$ and $C_5$ we just map $X$ to $\theta$. Finally we
were able to prove the following

Theorem. Every rational point on the curve

$$C_4: \theta^2\phi - \theta\phi^2 + \theta\phi + 2\theta - 2\phi - 1 = 0$$

such that $0 < \theta, \phi < 1$ and $2\theta + \phi > 1$ corresponds to a triangle with rational sides,
rational area, and two rational medians.
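
One instance of the theorem can be checked by hand or by machine. The sketch below is an illustration under stated assumptions: the point $(\theta, \phi) = (11/21, 3/77)$ was obtained by us by applying (6) to the triangle (28779, 13816, 15155) of Table 2, the sides are rebuilt from parametrization (5), and the dual map is taken in the form $\theta' = (2\theta^2 + \theta\phi + \theta + \phi - 1)/D$, $\phi' = (-\theta\phi - 2\phi^2 + \theta + \phi + 1)/D$ with $D = 3\theta\phi + \theta - \phi + 1$. All function names are ours.

```python
from fractions import Fraction
from math import isqrt

def c4(t, p):
    """Left-hand side of the curve C4."""
    return t*p*(t - p) + t*p + 2*(t - p) - 1

def dual(t, p):
    """Dual parameters (theta', phi'), in the form assumed above."""
    d = 3*t*p + t - p + 1
    return ((2*t**2 + t*p + t + p - 1) / d, (-t*p - 2*p**2 + t + p + 1) / d)

def sides(t, p):
    """Sides (a, b, c) from parametrization (5) with tau = 1."""
    a = (-2*p*t**2 - p**2*t) + (2*t*p - p**2) + t + 1
    b = (p*t**2 + 2*p**2*t) + (2*t*p - t**2) - p + 1
    c = (p*t**2 - p**2*t) + (t**2 + 2*t*p + p**2) + t - p
    return a, b, c

def is_rational_square(q):
    if q < 0:
        return False
    n, d = isqrt(q.numerator), isqrt(q.denominator)
    return n * n == q.numerator and d * d == q.denominator

theta, phi = Fraction(11, 21), Fraction(3, 77)
a, b, c = sides(theta, phi)
sixteen_area_sq = (a + b + c) * (-a + b + c) * (a - b + c) * (a + b - c)

print(c4(theta, phi) == 0)                  # the point lies on C4
print(a / c, b / c)                         # side ratios of the triangle
print(is_rational_square(sixteen_area_sq))  # rational area
print(dual(theta, phi))                     # parameters for the other median
```

Applying `dual` twice returns the original pair, as expected for a map that merely exchanges the roles of the two rational medians.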
The proof requires several technical lemmas that will appear in a forthcoming
paper. Here we just give an outline.

(i) The $\theta, \phi$ inequalities are obtained from the triangle inequalities.

(ii) Reduce the squarefree part of the square of the area from degree 11 to
degree 8 by applying the curve $C_4$ to Heron’s formula (1).

(iii) Transform the curve $C_4$ to minimal Weierstraß form to obtain $E$, the
elliptic curve found by Elkies in Section VI.

(iv) Finally, use induction in the group $E(\mathbb{Q})$ to show that any point that
corresponds to a triangle with rational area leads, in all possible ways, to
another point corresponding to a triangle with rational area.


VIII. TWO ISOLATED TRIANGLES. The story does not end here since two of 
the triangles found by computational search (the third and fourth entries of Table 
2) do not lie on any of our five elliptic curves. Although these two triangles were 
found using equations (5), they are probably not parametrizable by equations (9) 
since the five curves were numerically obtained from the latter. Each of these 
isolated triangles has associated with it six triangles that have a rational median 
and rational area and share a common Schubert parameter ratio. What role these 
ratios play is as yet undetermined. 

We are continuing further research into these two triangles, as we conjecture 
that all Heron triangles with two rational medians are produced by formulae 
similar to those we have presented in this paper. However, finding more examples 
like these two appears difficult. 
Connected Sprouts


T. K. Lam 


The game of Sprouts is an interesting game that has been discussed by several 
authors (see [1,2,3]). This two-person game begins with m points on a piece of 
paper. A play or a move is drawing an arc joining two points or a point to itself, 
and then adding a new point on the arc subject to the following conditions: 


(i) the arc does not cross another arc, and 
(ii) the degree of each point does not exceed 3. 


The game ends when no move can be made and we obtain a graph $G_0$ with
maximum degree 3 and minimum degree 2. Call $G_0$ the end graph of the Sprouts
game. It is known that the number of plays in a game is between 2m and 3m − 1
inclusive [1, 2, 3]. If we ignore all vertices of degree 2 in $G_0$, then we obtain a graph
G whose vertices are all of degree 3, which we call the cubic graph of the game.

In [2], Mark Copper described some properties of cubic graphs. He asked for 
tight lower bounds on the number of plays when the cubic graph is connected and 
when it is 2-connected. We answer these questions and point out an error in [2]. 

The connected graphs that can be obtained with 2m moves and m ≤ 3 are
given in [1, p. 566] and are shown in Figure 1. The original points are represented 
by the big dots. The small dots represent points added during a play. 




Figure 1 


In [2], Copper stated 


Proposition 5. Suppose that the cubic graph G arises from a complete game of
Sprouts on m vertices in p plays. If G is connected and m > 2, then p > 2m.


But this is not correct, and Figure 1 gives a counterexample for m = 3. A 
correct result is: 


Proposition. Suppose that the cubic graph G arises from a complete game of Sprouts
on m vertices in p plays. If G is connected, then

$p \ge 2m$ if $1 \le m \le 3$, and

$p \ge 2m + 1$ if $m > 3$.

Equality holds for some G for every value of $m = 1, 2, 3, \ldots$.


Proof: It is obvious that G is connected if and only if $G_0$ is connected, so we can
look at $G_0$ instead. For m = 1, 2 and 3, 2m is just the smallest possible number of
moves for a Sprouts game. Figure 1 shows that it is possible to get a connected
graph in just 2m moves. This leaves us with the case where m > 3.

We will modify the proof given by Copper. Let us assume that the game of
Sprouts produces a connected end graph $G_0$ in 2m moves. A quick check shows
that $G_0$ has 3m vertices, 4m edges, and after applying Euler’s formula, m + 2
faces. Denote the number of degree 2 vertices of $G_0$ by r. By counting the degrees
of all the vertices, we find that


2r + 3(3m —r) = 2(4m), andr =m. 


If deleting a degree 2 vertex disconnects $G_0$, call this vertex a bridge. Note that a
degree 2 vertex is a bridge if and only if it borders exactly 1 face. So, if we let b
denote the number of bridges, counting the number of faces gives


$b + 2(r - b) \le m + 2$, and $b \ge m - 2$.


When we remove all the bridges, $G_0$ breaks up into b + 1 connected components.
At least 2 of the connected components are non-trivial subgraphs to which only 1
bridge is attached. Let us call these end-components. Each end-component must
have at least 2 interior faces. By counting the number of faces again, we find that


$m + 2 \ge b + 4$, and $b \le m - 2$.
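
The counting steps so far can be tabulated mechanically. The following is a minimal sketch, assuming a connected end graph reached in exactly 2m moves; the function name and the returned keys are illustrative and not part of the original argument.

```python
def end_graph_counts(m):
    """Counts forced on a connected end graph obtained in exactly 2m moves."""
    moves = 2 * m
    vertices = m + moves              # every move adds one new point
    edges = 2 * moves                 # every move adds an arc split in two by the new point
    faces = 2 - vertices + edges      # Euler's formula V - E + F = 2
    # Degree count 2r + 3(V - r) = 2E, solved for r.
    r = 3 * vertices - 2 * edges
    b_lower = 2 * r - faces           # from b + 2(r - b) <= F
    b_upper = faces - 4               # from F >= b + 4
    return {
        "vertices": vertices,
        "edges": edges,
        "faces": faces,
        "degree_two": r,
        "bridges_lower": b_lower,
        "bridges_upper": b_upper,
    }

counts = end_graph_counts(5)
```

For every m the two bounds on the number of bridges coincide at m − 2, which is the equality used in the next step of the proof.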


Hence $b = m - 2$. Since $b + 2(r - b) = m + 2$, each face in $G_0$ must have a
degree 2 vertex in its border. Since the interior faces of an end-component do not
border a bridge, we are forced to conclude that there are exactly 2 end-components,
each with 2 interior faces sharing a degree 2 vertex in their common border.
We now turn to counting the number of edges. The 2 end-components must have
at least 5 edges each. Each bridge has two adjacent edges and the other connected
components have at least 2 edges each. This gives $2b + 2(b - 1) + 2 \times 5 = 4m$.
From this equality, we know the structure of $G_0$. There are exactly 2 end-components
with exactly 5 edges each. The other connected components have exactly 2
edges each. However, it can be verified that, for $b \ge 2$, this graph cannot be
obtained from a Sprouts game. Hence, $p \ge 2m + 1$. Figure 2 illustrates such a
graph when m = 5.


Figure 2 
To show that equality is possible for each value of m, it suffices to construct a
Sprouts game that ends in a connected graph in 2m + 1 moves. We illustrate the 
construction with an example for m = 5. We first play a game of Sprouts on 2 
points and get the graph shown in the middle of Figure 1. For the other points, we 
draw an arc connecting the point to itself in concentric circles around the first 2 
points. This gives the graph shown in Figure 3a. Then we connect the circles to 
each other by making the moves shown in Figure 3b. This construction can be 
performed for every m > 3. ∎


Figure 3a Figure 3b 


Figure 3 


Our proposition is equivalent to the Fundamental Theorem of Zeroth Order 
Moribundity stated in [1, p. 564]. 

Copper obtained a lower bound on the number of moves if the final graph 
obtained is 2-connected. 


Theorem [2, Proposition 4]. Suppose that the graph G arises from a complete game 
of Sprouts on m vertices in p plays. If G is 2-connected, then 


\[ p \ge \tfrac{7}{3}m - \tfrac{2}{3}. \]


We show that the improved lower bound $\lceil (7m - 2)/3 \rceil$ is tight. Without loss of
generality, we may take the m given points as lying on a circle. Joining adjacent 
pairs of points by arcs (and adding a new one on each arc) gives us a cycle of 2m 
points. Now take 3 consecutive points on the cycle, connect the 2 end points and 
then join the middle point to the newly created point. We repeat this for the next 3 
consecutive points on the cycle until we are left with fewer than 3 points on the 
cycle. If there is no point or 1 point left, the game ends. If there are 2 points left, 
we make a play on them and the game ends. 

Figure 4 illustrates an example for m = 4. 

It is a simple exercise now to count the number of moves. We obtain: 


\[ p = \begin{cases} 7k & \text{if } m = 3k, \\ 7k + 2 & \text{if } m = 3k + 1, \text{ and} \\ 7k + 4 & \text{if } m = 3k + 2. \end{cases} \]


This agrees with $\lceil (7m - 2)/3 \rceil$.
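
The move count of this construction can be checked directly. The sketch below follows the description above (m moves to build the cycle, two moves per block of 3 consecutive cycle points, one extra move when 2 points remain); the function names are illustrative only.

```python
def construction_moves(m):
    """Number of plays used by the circle construction described above."""
    moves = m                      # join adjacent pairs of the m points on the circle
    cycle_points = 2 * m           # the m original points plus one new point per arc
    blocks, left_over = divmod(cycle_points, 3)
    moves += 2 * blocks            # connect the 2 end points, then join the middle point
    if left_over == 2:
        moves += 1                 # one final play on the 2 remaining points
    return moves

def ceiling_bound(m):
    """The bound ceil((7m - 2) / 3), computed with integer arithmetic."""
    return -(-(7 * m - 2) // 3)

agreement = all(construction_moves(m) == ceiling_bound(m) for m in range(1, 200))
```

The three residue classes of m modulo 3 give 7k, 7k + 2 and 7k + 4 moves respectively, matching the case analysis.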
Figure 4


At the end of his article, Copper also asked which planar cubic graphs are cubic
graphs of some Sprouts game. Though it is tempting to conjecture that all simple 
planar cubic graphs can be so obtained, it is not true. A counterexample is shown 
in Figure 5. 


Figure 5 
Regular Expressions
for Program Computations 


Ronald E. Prather 


In the brief history of computer science, no single article has attracted more 
attention nor stirred more controversy than a seemingly innocuous 1968 letter to 
the editor (of the Communications of the ACM) by Edsger Dijkstra [3], titled 
“Goto Statement Considered Harmful.” We were just emerging from the Fortran 
era, when programming had not yet been thought of as an art or even a craft. A 
diagram of program flowlines, what came to be called a flowchart, would very 
nearly resemble a plate of spaghetti—go to here and do this, then go to there and 
do that. Programs were “held together with baling wire,” little thought being given 
to their organizational structure. Dijkstra argued that an “unbridled use of the 
goto statement” was the heart of the problem. In his opening sentence, he asserts 
that “the quality of programmers is a decreasing function of the density of goto 
statements in the programs they produce.” He would later argue for the elimina- 
tion of the goto statement altogether. At the time, nothing could have been more 
controversial. Many programmers argued that some algorithms were impossible (or 
at the very least, unnatural) to implement without the use of the goto. But new 
programming languages were on the horizon (e.g., Pascal particularly), offering 
such structural constituents as the while do, repeat until, if then else, and case 
statements, whereby one could organize a program into successive levels of 
refinement, giving some credence to the Dijkstra claim. 

Proponents of the Dijkstra discipline of programming began to use a cryptic (at 
the time) and somewhat obscure result of Corrado Böhm and Giuseppe Jacopini
[1] as justification for their optimism and enthusiasm. In effect, their work seemed 
to indicate that any algorithm could be written as a structured program [2], i.e., as a 
program involving only repetition (whether of the “while do” or the “repeat until” 
variety), selection (typically characterized by the “if then else” construct) and 
sequence of (possibly compound) statements, these three constructs perhaps being 
nested one within the other to an arbitrary depth, down to the level of the 
elementary processes of assignment, input, and output. And most importantly, there 
was seen to be no need whatsoever of employing the “harmful” goto statement. 

But the Böhm-Jacopini result was not all that clearly understood at the time.
Even Dijkstra himself stated only that “they seem to have proved the logical 
superfluousness of the goto statement.” For it happens that it is not all that easy to 
give a totally convincing proof in elementary terms. We take up just such a 
challenge in this survey. We use the opportunity to introduce the reader to a 
modern software engineering framework for the investigation, using topics that are 
important to a contemporary computer science research area known as software 
metrics, and we further employ an algebra of regular expressions, as most com- 
monly encountered in the theory of automata, all toward offering a modified form 
of the now-classical Böhm-Jacopini result, and thus achieving a presentation
blending the new with the old. 
PROGRAM FLOWGRAPHS. To begin, one must propose a model of program
computation. We suppose that the reader is familiar with the notion of an alphabet
$\Sigma$, i.e., a finite set of “symbols”, and with the idea that a finite sequence of such
symbols is said to constitute a word [9]. The null word is given a special representation
$\varepsilon$, considered as an empty sequence of symbols. It is the identity element in a
monoid with respect to a binary concatenation operation on words. A collection $L$
of words is then called a language (over the alphabet $\Sigma$), and we will have occasion
to make use of the following operations on languages:


(1) concatenation: $LM = \{xy : x \in L \text{ and } y \in M\}$
(2) substitution: $L(\sigma : M) = \{x(\sigma : y) : x(\sigma) \in L \text{ and } y \in M\}$
(3) join: $L \cup M = \{x : x \in L \text{ or } x \in M\}$


where in (2), we merely substitute individual words of $M$ for each occurrence of
the specific symbol $\sigma \in \Sigma$ in words of $L$.


Example 1. Suppose $\Sigma = \{a, b, c, d, e\}$ and we are given the language
$L = \{e, ece, ecece, \ldots\}$
wherein we designate $\sigma = e$ in (2). If $M = \{a, bd\}$ then
$L(e : M) = \{a, bd, aca, acbd, bdca, bdcbd, acaca, \ldots\}$
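
The three operations on languages can be sketched for finite languages as follows. This is only an illustration under stated assumptions: words are Python strings, each symbol is a single character, and the infinite language of Example 1 is truncated to its first two words. The function names are not from the original.

```python
from itertools import product

def concat(L, M):
    """Concatenation LM = {xy : x in L and y in M}."""
    return {x + y for x in L for y in M}

def join(L, M):
    """Join (union) of two languages."""
    return set(L) | set(M)

def substitute(L, sigma, M):
    """Replace every occurrence of the symbol sigma in words of L by words of M.

    Each occurrence is replaced independently, so a word with k occurrences
    of sigma yields up to len(M)**k words.
    """
    result = set()
    for word in L:
        pieces = word.split(sigma)            # text between occurrences of sigma
        occurrences = len(pieces) - 1
        for choice in product(sorted(M), repeat=occurrences):
            new_word = pieces[0]
            for replacement, piece in zip(choice, pieces[1:]):
                new_word += replacement + piece
            result.add(new_word)
    return result

L_truncated = {"e", "ece"}                    # first two words of L in Example 1
M = {"a", "bd"}
substituted = substitute(L_truncated, "e", M)
```

With this truncation, `substituted` contains the six shortest words listed for $L(e : M)$ in Example 1.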


In the program model to be proposed, the symbols $\sigma \in \Sigma$ play the role of the
individual elementary processes of a particular algorithm, whatever they may be. 
We thus are led to an abstraction of the notion of a flowchart, so as to stay within 
the realm of arbitrary (perhaps wildly unstructured) programs. As indicated, we do 
not give any attention to the specific nature of the elementary processes of an 
algorithm, identifying them only as symbols. Neither do we give any specific 
attention to the exact nature of the program decisions; they will be identified only 
as nodes of a flowgraph. With all of this in mind, we define a program flowgraph $F$
of order $n$ over the alphabet $\Sigma$ to consist of $3n + 1$ edges among $2n + 1$ nodes or
vertices, one distinguished and named $X_F$, the others evenly divided between
decision and junction nodes, with an orientation of the edges such that:


(i) decision nodes have indegree = 1 and outdegree = 2; 
(ii) junction nodes have indegree = 2 and outdegree = 1; 
(iii) the distinguished node has indegree = 1 and outdegree = 1; 
(iv) every vertex lies on a circuit through $X_F$ (and such a circuit represents a
computation of F). 


Furthermore, each of the edges is to be labeled with a word over the alphabet $\Sigma$,
the intent being to represent the sequence of elementary processes performed in
traversing the individual edges of the flowgraph. The distinguished vertex $X_F$
serves to identify a further pair of nodes, start $= 0_F$ and stop $= 1_F$, owing to the
unique pair of edges $1 \to X \to 0$. We omit subscripts on $X$, $0$, $1$ when the flowgraph
$F$ is clear from the context. Except for the vertex $X$, we have an underlying
cubic graph [6], i.e., every vertex (except $X$) has degree three.

In our flowchart examples, e.g., Figure 1, decision nodes are drawn as black or 
solid circles, whereas junction nodes are drawn as white or open circles, merely to 
call attention to the distinction. The distinguished vertex X is drawn as an 
encircled cross (for ‘X’). Because of the uniqueness of X and the property 
$1 \to X \to 0$, it is not necessary to identify 0 and 1 in our drawings.
Figure 1. (a) selection flowgraph G; (b) repetition flowgraph F; (c) composite flowgraph F[e → G].


Example 2. While a program flowgraph can represent any program whatsoever 
[11], we will be especially interested in the so-called structured flowgraphs, which 
are built from only the sequence, selection, and repetition constructs. A sequence 
of elementary processes is represented as a word on the alphabet $\Sigma$, as in the case
of bd in Figure 1(a); descriptions of more general sequences follow. A selection 
from among two alternating processes involves a decision, as in the case of the 
black circle of Figure 1(a), which by the convention we have introduced, serves as 
the start node for the flowgraph G. Thus, as computations of the selection that 
Figure 1(a) represents, we perform the elementary processes b and d sequentially, 
or we perform the elementary process a, depending on the outcome of the 
decision. It is important to remember that the exact nature of the decision is of no 
concern in the theory we are developing here. In Figure 1(b), flowgraph F is an 
instance of a repetition. Since the start node here is a junction node, and we are 
led (in performing a null process, symbolized by the null word) to a decision node, 
we may then be led to stop or to perform the elementary process c, depending on 
the outcome. In the case of the latter, we are back where we started, so that c may 
be executed arbitrarily many (perhaps zero) times in repetition. It is well that such 
computations first be understood in the context of small examples, as illustrated 
here. 


Among the ways for building larger flowgraphs from given ones, we will be 
especially interested in the following two constructions: 
sequence $F \circ G$: Define $0_{F \circ G} = 0_F$ and $1_{F \circ G} = 1_G$ while merging $X_F = X_G = X_{F \circ G}$;
then replace the pair of edges $1_F \to X_F$ and $X_G \to 0_G$ by $1_F \to 0_G$ while
concatenating the two labels into one.


nesting $F[e \to G]$: Assume that the edge $e = (V, W)$ of $F$ is labeled by $\varepsilon$. Define
$0_{F[e \to G]} = 0_F$ and $1_{F[e \to G]} = 1_F$ and also $X_{F[e \to G]} = X_F$, while (eventually)
discarding $X_G$ and the edges $X_G \to 0_G$ and $1_G \to X_G$, and forming edges $V \to 0_G$
and $1_G \to W$ in place of $e$, respectively.


Each is an associative operation, so that compound sequences and multiple 
nestings can be introduced without regard to the order of their construction. 


Example 3. If G is the flowgraph of Figure 1(a), seen to represent the program 
fragment 

if ... then a else bd,
and if F is the flowgraph of Figure 1(b), taken to represent


while ... do c,
(we do not identify the decisions (...)), then we may nest G into F at the edge e
labeled by $\varepsilon$ in Figure 1(b), to obtain the composite flowgraph $F[e \to G]$ shown in
Figure 1(c).


Now let F be any program flowgraph over the alphabet $\Sigma$. As a way of
describing the multitude of combinations of computations that might be performed 
in executing the program that F represents, we introduce the computation set of F, 
defined to be the language 

\[ L(F) = \{x_1 x_2 \cdots x_r \text{ such that } x_i \text{ labels } e_i\ (1 \le i \le r) \text{ and } e_1 e_2 \cdots e_r \text{ is a path from } X \text{ to } X \text{ in } F\}. \]
In effect, we thereby provide an elementary operational semantics for our program 
flowgraphs, saying that L(F) is the “meaning” of F, describing as it does all 
possible sequences of elementary processes that could result. We note, however, 
that this is truly an “elementary” semantics, far less detailed than the denotational 
semantics that one ordinarily introduces in a programming language context [14]. 
But we feel that it is sufficient for our purposes. 
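
The computation set can be enumerated for small flowgraphs by walking circuits through X. The sketch below is only an illustration under stated assumptions: a flowgraph is a list of (tail, head, label) triples, the node names J, D, JG, DG are hypothetical, the empty string stands for the null word, a circuit is taken to end at its first return to X, and the search is cut off after a fixed number of edges because L(F) is generally infinite.

```python
def computation_set(edges, max_edges, start="X"):
    """Words spelled by circuits start -> ... -> start using at most max_edges edges."""
    out_edges = {}
    for tail, head, label in edges:
        out_edges.setdefault(tail, []).append((head, label))
    words = set()
    stack = [(start, "", 0)]               # (current node, word so far, edges used)
    while stack:
        node, word, used = stack.pop()
        for head, label in out_edges.get(node, []):
            if head == start:
                words.add(word + label)    # the circuit closes at X
            elif used + 1 < max_edges:
                stack.append((head, word + label, used + 1))
    return words

# Figure 1(b), with the null-labeled edge written as the symbol e (as in Example 4).
F = [("X", "J", ""), ("J", "D", "e"), ("D", "J", "c"), ("D", "X", "")]
# Figure 1(a): a decision followed by the alternatives a and bd.
G = [("X", "D", ""), ("D", "J", "a"), ("D", "J", "bd"), ("J", "X", "")]
# Figure 1(c): G nested into F at the edge e.
NESTED = [("X", "J", ""), ("J", "DG", ""), ("DG", "JG", "a"), ("DG", "JG", "bd"),
          ("JG", "D", ""), ("D", "J", "c"), ("D", "X", "")]

words_F = computation_set(F, 7)
words_G = computation_set(G, 7)
words_nested = computation_set(NESTED, 9)
```

With these cut-offs the three sets contain the shortest words of L(F), L(G) and L(F[e → G]) respectively.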

The connection with the operations on languages, as previously introduced, is 
quite apparent and straightforward, as seen in the following pair of elementary 
rules: 


(1) $L(F \circ G) = L(F)L(G)$
(2) $L(F[e \to G]) = L(F)(e : L(G))$.
In the latter, we view $F$ (on the right) as a program flowgraph over the extended
alphabet $\Sigma \cup \{e\}$, having introduced the new label $e$ for the edge ($e$) previously
labeled by $\varepsilon$.


Example 4. Let F and G be as given in Figure 1. Then in viewing e as a member
of the extended alphabet $\{a, b, c, d\} \cup \{e\}$, we have


$L(F) = \{e, ece, ecece, \ldots\} = L$ (Example 1)
$L(G) = \{a, bd\} = M$ (Example 1)
and
$L(F)(e : L(G)) = L(e : M) = \{a, bd, aca, acbd, bdca, bdcbd, acaca, \ldots\} = L(F[e \to G])$
REGULAR EXPRESSION ALGEBRA. The so-called regular expressions [13] have
found a number of applications in computer science. But mainly they are known
for representing the languages that are recognized by finite state automata [7]. Our
effort to confirm a version of the Böhm-Jacopini statement rests on showing that
regular expressions can be seen to represent the computation set of any program 
flowgraph. 

The algebra of regular expressions over the alphabet $\Sigma$ is the smallest collection
of expressions containing the symbols $\sigma \in \Sigma$ and the distinguished elements 0, 1, and
closed under the three binary operations:


(i) sum: $\alpha + \beta$
(ii) product: $\alpha\beta$
(iii) star: $\alpha * \beta$


while satisfying the axioms: 


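(1a) $\alpha + \beta = \beta + \alpha$    (1b) $\alpha + \alpha = \alpha$

(2a) $\alpha + (\beta + \gamma) = (\alpha + \beta) + \gamma$    (2b) $\alpha(\beta\gamma) = (\alpha\beta)\gamma$

(3a) $\alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma$    (3b) $(\alpha + \beta)\gamma = \alpha\gamma + \beta\gamma$

(4a) $0\alpha = 0$    (4b) $0 + \alpha = \alpha$

(5a) $1\alpha = \alpha$    (5b) $1 * \alpha = 1 * (1 + \alpha)$

(6a) $\alpha\beta * \gamma = \alpha(\beta * \gamma\alpha)$    (6b) $\alpha * \beta = (\alpha * \beta)\beta\alpha + \alpha$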


Salomaa has shown [12] that we have here a complete (and consistent) axiomatic 
system, relative to the interpretation that is about to be given. 

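In this framework, we now present the canonical inductive interpretation of
regular expressions. For each regular expression $\alpha$ over the alphabet $\Sigma$, we define
the language $\lambda(\alpha)$ represented by $\alpha$ as follows. First we set


$\lambda(\sigma) = \{\sigma\}$ for $\sigma \in \Sigma$, and
$\lambda(0) = \emptyset$ and $\lambda(1) = \{\varepsilon\}$,
then inductively, we define


(i) $\lambda(\alpha + \beta) = \lambda(\alpha) \cup \lambda(\beta)$
(ii) $\lambda(\alpha\beta) = \lambda(\alpha)\lambda(\beta)$
(iii) $\lambda(\alpha * \beta) = \bigcup_{k=0}^{\infty} \lambda(\alpha)(\lambda(\beta)\lambda(\alpha))^k$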


One checks that if we define equality of regular expressions according to the 
agreement 

$\alpha = \beta$ if and only if $\lambda(\alpha) = \lambda(\beta)$,
then the entire axiom scheme outlined above is satisfied. 
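
The inductive interpretation can be made concrete with a small evaluator. This is a rough sketch, not the paper's construction: expressions are nested tuples (a hypothetical encoding), and because the binary star denotes an infinite union, each star is truncated to the terms k = 0, ..., max_k, so the result is only a finite part of the language.

```python
def lam(expr, max_k=2):
    """Finite approximation of the language represented by a regular expression.

    expr is ("sym", s), ("zero",), ("one",), ("sum", a, b), ("prod", a, b)
    or ("star", a, b); every star is cut off after the term with k = max_k.
    """
    kind = expr[0]
    if kind == "sym":
        return {expr[1]}
    if kind == "zero":
        return set()
    if kind == "one":
        return {""}                        # the null word
    left = lam(expr[1], max_k)
    right = lam(expr[2], max_k)
    if kind == "sum":
        return left | right
    if kind == "prod":
        return {x + y for x in left for y in right}
    if kind == "star":
        result = set()
        block = set(left)                  # words of left (right left)^k, starting at k = 0
        for _ in range(max_k + 1):
            result |= block
            block = {w + y + x for w in block for y in right for x in left}
        return result
    raise ValueError("unknown expression kind: " + kind)

# e * c, whose language consists of e, ece, ecece, ...
rho = ("star", ("sym", "e"), ("sym", "c"))
# (a + bd) * c
composite = ("star", ("sum", ("sym", "a"), ("prod", ("sym", "b"), ("sym", "d"))),
             ("sym", "c"))
```

As a spot check of axiom (5b), the truncated languages of `1 * a` and `1 * (1 + a)` agree for the same cut-off, both giving the powers of a up to length max_k.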

We will eventually draw an explicit connection between the languages ($\lambda$)
corresponding to regular expressions and the languages ($L$) corresponding to
program flowgraphs. For the present, we conclude this brief survey of the algebra 
of regular expressions with an elementary result making use of substitution and 
relating to (2) in our earlier discussion of languages: 


Lemma. Let $\rho = \rho(\sigma)$ be a regular expression involving (among others) the symbol
$\sigma \in \Sigma$. If we substitute for $\sigma$, wherever it occurs, the regular expression $\mu$, then we
obtain a composite regular expression $\rho(\sigma(\mu))$ and


\[ \lambda(\rho(\sigma(\mu))) = \lambda(\rho)(\sigma : \lambda(\mu)). \]
Example 5. In relation to all four of our previous examples, if we are given the
pair of regular expressions:


\[ \rho = e * c, \qquad \lambda(\rho) = \bigcup_{k=0}^{\infty} e(ce)^k \]
\[ \mu = a + bd, \qquad \lambda(\mu) = \{a, bd\} \]

then
\[ \rho(e(\mu)) = (a + bd) * c \]
\[ \lambda(\rho(e(\mu))) = \lambda((a + bd) * c) = \bigcup_{k=0}^{\infty} \lambda(a + bd)(\lambda(c)\lambda(a + bd))^k = \bigcup_{k=0}^{\infty} \{a, bd\}(c\{a, bd\})^k = \lambda(\rho)(e : \lambda(\mu)). \]


FLOWGRAPH DECOMPOSITION. The field of “software metrics” [4] falls within 
the general domain of software engineering. Whereas software engineering has 
come to refer to activities concerned with turning software design, development, 
and maintenance into a disciplined engineering practice, software metrics has come 
to be understood as just about anything within software engineering that has a 
quantifiable feel to it. Most broadly interpreted, this would include anything 
having to do with predicting software product costs, measuring and improving 
programming productivity, and measuring and predicting the quality and complex- 
ity of software products. The majority of its practitioners, however, intend a 
somewhat narrower interpretation, whereby one seeks first of all to provide a 
mathematical model for the notion of a program, then over that model, to 
introduce various numerical-valued functions (metrics) that attempt to measure 
one or more of the attributes implicit in the above listing, e.g., programming cost, 
measures of testing complexity, projected maintenance costs, etc. 

We have already introduced an appropriate flowgraph model for the software 
metric activity. In the important subdiscipline of “hierarchical software metrics” 
[11], it must be possible to evaluate a metric as a recursive operation over a certain 
“flowgraph decomposition,” one that we are about to describe. That being done, 
rather than to venture off into the applications in the hierarchical metric theory, 
however, we use the decomposition as a vehicle for deriving a modified form of the 
Böhm-Jacopini result.

Recalling the flowgraph sequence and nesting constructions described earlier, 
our decomposition theorem will be seen to answer the “inverse question”: Given 
an arbitrary flowgraph, how may it be decomposed using these two operations? In 
this connection, it will then be important to give special attention to the prime 
flowgraphs, i.e., those that are irreducible or indecomposable, with respect to both 
sequence and nesting. And with this in mind, the fundamental result is a recursive 
process, discovered and attributed anew to various researchers, that we choose to 
identify here as follows: 




Theorem 1 (Prather-Guilieri [10]). Every program flowgraph $F$ has a unique decomposition
\[ F = P_1 \circ \cdots \circ P_i[e_j \to F_{ij}] \circ \cdots \circ P_k \]
into a sequence of primes. Each prime $P_i$ may have an edge $e_j$ onto which a maximal
flowgraph $F_{ij}$ is nested. The decomposition applies (that is to say, each $F_{ij}$ is in turn a
sequence of primes, etc.), recursively, all the way down to the level of elementary
processes.


Figure 2


Corollary. The computation set of any program flowgraph $F$ may be written as a
concatenation:


\[ L(F) = \prod_{i=1}^{k} L(P_i[e_j \to F_{ij}]) \]
of the computation sets of its top-level primes.


Here, we have used Rule (1), where we remind the reader that the notation 
$P[e \to F]$ refers to a nesting of $F$ on $P$. It has been shown [11] that the prime
flowgraphs are the triple-connected ones, i.e., those for which one must cut 
through at least three flowlines in order to separate the flowgraph. Examples will 
be given shortly. But first, we turn our attention to a companion result that 
provides a recursive enumeration of the entire class of prime flowgraphs: 


Theorem 2 (Fenton-Whitty [5]). Every prime of order $n + 1$ is obtained by “grafting”
(see Figure 2, where five edges $\{d_1, d_2, e, f_1, f_2\}$ replace two, $\{d, f\}$) a decision-to-junction
edge onto a prime of order $n$, and as a result, the prime flowgraphs can be
effectively enumerated (albeit with duplication) as an infinite union


\[ S = \bigcup_{n=0}^{\infty} S_n \]
where we identify (see Figure 3 and the discussion to follow):
$S_0 = \{C\}$
$S_1 = \{D_1, D_2\}$
$S_2 = \{E_1, E_2, E_3, E_4, E_5, E_6\}$


etc., and in general, $S_n$ is the subclass of primes of order $n$.
Figure 3


This result is an adaptation of a well-known principle in the theory of triply-connected
cubic graphs [6].

The degenerate circuit flowgraph C (of order 0) can be used, say with its only
edge labeled by a word x over the alphabet $\Sigma$, as a means of nesting a sequence of
elementary processes onto a given program flowgraph. The so-called Dijkstra
flowgraphs $D_1$ and $D_2$ can be seen to represent the programming language “if
then else” and “repeat while do” constructs, respectively, the latter a combination
of the conventional “repeat until” (testing for terminating the repetition after
execution of a process) and “while do” (testing before execution). The reader
should consult Example 4 and Example 5 for a discussion of the regular expres- 
sions that relate to the Dijkstra flowgraphs, in anticipation of the Corollary to 
follow. 

But first, we note that the primes, in general, are really just graphs, rather than 
program flowgraphs. That is to say, they do not have labels on their edges, unless 
the edges are viewed as being labeled by themselves—as symbols of a kind of 
auxiliary alphabet. And yet, regular expressions can be formed with these edges- 
as-labels, leading eventually to a computational set, when each individual edge 
label is substituted by a genuine regular expression (over the alphabet $\Sigma$) relating
to a nesting. This is the sense of the next result. 


Corollary. If $P[e_i]$ is a prime flowgraph, then there is a regular expression $\rho_P(e_i)$ over
the alphabet of edges-as-symbols, such that


\[ L(P) = \lambda(\rho_P). \]


Proof: We proceed by induction over the flowgraph order $n$. If $n = 0$ we have the
degenerate case $P = C$ that is easily handled. When $n = 1$ we have $P = D_1[a, b]$
or $P = D_2[a, b]$, where without loss of generality, we identify only the edges not
involving the distinguished node $X$, i.e., $a$ for the left edge and $b$ for the right
edge in each instance of Figure 1. Clearly the regular expressions


\[ \rho_{D_1} = a + b \quad\text{and}\quad \rho_{D_2} = a * b \]


satisfy the indicated property, as seen in Example 2. In arguing inductively,
however, it is better to show, more generally, that for each pair of vertices
$u, v \ne X$, there is a regular expression $\rho_{uv}$ representing all paths from $u$ to $v$, for
then we merely set $\rho_P = \rho_{X0}\,\rho_{01}\,\rho_{1X}$. Obviously this more general assertion is true
for $n = 0, 1$. And if it is true for all prime flowgraphs of order $n$, and $P'$ is of order
$n + 1$, we set $P' = P + e$ in Theorem 2. Then in using the notation of Figure 2,
together with the substitution $d = d_1 d_2$ and $f = f_1 f_2$, we use the regular expressions
$\rho_{uv}$ of the flowgraph $P$ to construct


\[ \rho'_{uv} = \rho_{uv} + \rho_{ux}(d_1 e f_2 * \rho_{yx})\rho_{yv} \]
= {paths from $u$ to $v$ in $P$} ∪ {paths through $e$}
= {paths from $u$ to $v$ in $P'$},
as required.


STRUCTURED PROGRAMS SUFFICE. The notion of a structured program (one 
employing only the sequence, selection, and repetition constructs [2]) is fundamen- 
tal to what is now, thanks to Dijkstra [3], considered “good” programming 
practice. But the implementation of primes of order n > 1 requires the use of goto 
statements in those programming languages that otherwise support only the 
sequence, selection, and repetition. Now according to the Böhm-Jacopini theory, it
is always possible to restructure an arbitrary given flowgraph so as to avoid the use 
of such higher-order primes (and to thereby circumvent the need for employing 
goto statements). That is the conclusion that we come to here. 


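Theorem 3. Corresponding to each program flowgraph $F$ is a regular expression $\rho_F$
with
\[ L(F) = \lambda(\rho_F). \]
Proof: By Theorem 1 we have
\[ F = P_1 \circ \cdots \circ P_i[e_j \to F_{ij}] \circ \cdots \circ P_k \]
and with an inductive hypothesis that the theorem is true for flowgraphs of lesser
order than that of $F$, there are regular expressions $\rho_{ij}$ with $L(F_{ij}) = \lambda(\rho_{ij})$. Then
according to the Corollary to Theorem 2, there are also regular expressions $\rho_{P_i}$
with $L(P_i) = \lambda(\rho_{P_i})$ on an alphabet of edges-as-symbols. We define


\[ \rho_F = \prod_{i=1}^{k} \rho_{P_i}(e_j(\rho_{ij})) \]


and in combining all of our earlier results, we obtain:


\begin{align*}
L(F) &= L(P_1 \circ \cdots \circ P_i[e_j \to F_{ij}] \circ \cdots \circ P_k) && \text{Theorem 1} \\
&= \prod_{i=1}^{k} L(P_i[e_j \to F_{ij}]) && \text{Corollary, Theorem 1} \\
&= \prod_{i=1}^{k} L(P_i)(e_j : L(F_{ij})) && \text{Rule (2)} \\
&= \prod_{i=1}^{k} \lambda(\rho_{P_i})(e_j : L(F_{ij})) && \text{Corollary, Theorem 2} \\
&= \prod_{i=1}^{k} \lambda(\rho_{P_i})(e_j : \lambda(\rho_{ij})) && \text{Inductive hypothesis} \\
&= \prod_{i=1}^{k} \lambda(\rho_{P_i}(e_j(\rho_{ij}))) && \text{Lemma} \\
&= \lambda\Bigl(\prod_{i=1}^{k} \rho_{P_i}(e_j(\rho_{ij}))\Bigr) && \text{Definition (ii)} \\
&= \lambda(\rho_F) && \text{Definition of } \rho_F,
\end{align*}


as required.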


Corollary (Böhm-Jacopini). Corresponding to each program flowgraph F is a structured
program flowgraph with the same computation set.


Proof: There is a one-to-one correspondence (see the opening lines of the proof of
the Corollary to Theorem 2) between the three binary operations of the regular
expression algebra and the three structured programming constructs: sequence,
selection ($D_1$), and repetition ($D_2$).


Our result is not an “exact replica” of the Böhm-Jacopini theorem: “Given any
program, there is an ‘equivalent’ structured program” since we have used a more
elementary semantics than was their intent. In the true Böhm-Jacopini sense, two
programs are considered to be equivalent only if they compute the same (partial) 
function, relative to certain well-defined input and output program variables. Since 
we have “abstracted away” all references to the exact nature of individual program
decisions and the specific details of the elementary processes of an algorithm (in
fact, we have no variables whatsoever), the idea of a program “computing a
function” is not meaningful in our setting. And it is precisely for this reason that
we are able to describe the “meaning” of a program (its computation set) at the 
lowest automata-theoretic linguistic level, that of regular languages. The more 
extensive semantics would take us to the level of the languages accepted by Turing 
machines [8], where the whole argument is more complex. Nevertheless, we are 
confident that our construction (that of Theorem 3 and the Corollary to Theorem 
2) represents a “restructuring” process that could easily be transformed in the 
more extensive semantic domain, in order to obtain the corresponding result in the 
Böhm-Jacopini sense of program equivalence.
Figure 2


crossings, which turns out to be optimal. Since the complete graphs have a very 
special structure indeed, we can hope to calculate their crossing numbers. 

There are conjectures for the crossing numbers of both the complete and 
complete bipartite graphs [3]: 


ns HEE 


(kno) [Fl [15 | 


However, these remain open. Some partial results are known: the former has been 
verified for n ≤ 10, while the latter holds for m ≤ 6 and all n [4] and for m = 7 
and n ≤ 10 [7]. 
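As a quick aid for experimenting with the two conjectured values, here is a minimal Python sketch. The function names `conjectured_cr_complete` and `conjectured_cr_bipartite` are illustrative choices, not notation from the article, and the code only evaluates the conjectured formulas; it proves nothing about the true crossing numbers.

```python
def conjectured_cr_complete(n: int) -> int:
    """Conjectured cr(K_n): (1/4) * floor(n/2) * floor((n-1)/2) * floor((n-2)/2) * floor((n-3)/2)."""
    product = (n // 2) * ((n - 1) // 2) * ((n - 2) // 2) * ((n - 3) // 2)
    # The product is always divisible by 4, so integer division is exact.
    return product // 4


def conjectured_cr_bipartite(m: int, n: int) -> int:
    """Conjectured cr(K_{m,n}): floor(m/2) * floor((m-1)/2) * floor(n/2) * floor((n-1)/2)."""
    return (m // 2) * ((m - 1) // 2) * (n // 2) * ((n - 1) // 2)


if __name__ == "__main__":
    for n in range(5, 11):
        print(n, conjectured_cr_complete(n))
    print(conjectured_cr_bipartite(3, 3), conjectured_cr_bipartite(4, 4))
```

For n = 5, ..., 10 the first function returns 1, 3, 9, 18, 36, 60, which are the values in the range where the conjecture has been verified.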

The best known drawings of $K_{m,n}$ and $K_n$ achieve these values. The description 
of such a drawing for $K_{m,n}$ is quite simple. Divide both the m-set and the n-set 
into two as-equal-as-possible parts. Place the m along the y-axis, with half above 
the x-axis and half below. Similarly, place the n along the x-axis, with half to the 
left of the y-axis and half to the right. Now join the m to the n using straight lines. 
The second drawing in Figure 1 is such a drawing of K, ,. 

Turán’s story suggests a variant of the crossing number problem for complete 
bipartite graphs: find the smallest number of crossings in a cylindrical drawing of 
$K_{n,n}$, that is, a drawing of $K_{n,n}$ on a cylinder such that each class of n vertices is on 
one of the two boundaries of the cylinder. 

One way to get a drawing of $K_{2n}$ in the plane is to start with a cylindrical drawing 
of $K_{n,n}$ and then use the top and bottom of the cylinder to complete the drawing 
of $K_{2n}$. See Figure 3 for the case n = 4. 

Obviously, this drawing of $K_{2n}$ has $2\binom{n}{4}$ more crossings than the cylindrical 
drawing of $K_{n,n}$; this type of drawing of $K_{2n}$ is described in [6]. With an 
appropriate choice of cylindrical drawing of $K_{n,n}$, the conjectured crossing number 
of $K_{2n}$ is obtained this way. 

One might hope that some better cylindrical drawing of $K_{n,n}$ exists and, 
therefore, a better drawing of $K_{2n}$ would result. In Section 2, we associate a 
quadratic form with such drawings. Minimizing the quadratic form, we find the 
best cylindrical drawing of $K_{n,n}$ and so get the best drawing of $K_{2n}$ of this type. 

Figure 3 


In Section 3 we shall discuss asymptotic values of the crossing numbers of $K_n$ 
and $K_{n,n}$. It is easy to see (and will be discussed in Section 3) that the sequences 
$\mathrm{cr}(K_n)/\binom{n}{4}$ and $\mathrm{cr}(K_{n,n})/\binom{n}{2}^2$ are monotonically increasing and each term is less 
than 1. Therefore, the limits 

\[ \lim_{n\to\infty} \mathrm{cr}(K_n)\Big/\binom{n}{4} \quad\text{and}\quad \lim_{n\to\infty} \mathrm{cr}(K_{n,n})\Big/\binom{n}{2}^2 \]

both exist and are at most 1. The conjectures on the values of the crossing numbers 
$\mathrm{cr}(K_n)$ and $\mathrm{cr}(K_{n,n})$ imply that the limits are 3/8 and 1/4, respectively. We prove 
in Section 3 that the latter implies the former. 


2. CYLINDRICAL DRAWINGS OF $K_{n,n}$. We want to determine a lower bound 
on the number of crossings in any cylindrical drawing of $K_{n,n}$. We need to discover 
just what forces a crossing in the drawing. Consider, first, a single vertex v of $K_{n,n}$. 
All the edges incident with v are drawn across the cylinder to vertices on the other 
boundary. No two of these edges cross in an optimal drawing; see Figure 4. 


Figure 4 


Now consider two vertices v and w on the same boundary. There are several 
possibilities for how the edges incident with these vertices are drawn, but we can 
see (Figure 5) that no two edges cross more than once in an optimal drawing. So 
how can two edges be forced to cross? 


Figure 5 


A little thought yields a simple observation. 


For each vertex i on the inside boundary, there is a vertex $x_i \in \{1, 2, \ldots, n\}$ on the outside 
boundary such that the simple closed curve consisting of the edges from i to each of $x_i$ and 
$x_i + 1$ (the arithmetic being taken modulo n), together with the little segment of the outer 
boundary of the cylinder joining $x_i$ and $x_i + 1$, bounds a disc containing the inner boundary of 
the cylinder. 


As examples, in Figure 6, x, = 5 and x, = 7. 


Figure 6 


Now it is a simple matter to get a lower bound on the number of crossings given 
that the values of $x_1, x_2, \ldots, x_n$ are known. We need only deal with these in pairs, 
i.e., it suffices to calculate the number of crossings among edges incident with the 
vertices i and j on the inside boundary. If we pick two vertices r and s between 
$x_i + 1$ and $x_j$, say, then, among the four edges with ends i or j and r or s, there 
must be at least one crossing (Figure 7a). Similarly, if r and s are both between 
$x_j + 1$ and $x_i$. But if one is between $x_i + 1$ and $x_j$ and the other is between $x_j + 1$ 
and $x_i$, then there need not be a crossing (Figure 7b). 

Figure 7 

Assuming that $1 \le x_i \le x_j \le n$, it follows that there are at least 

\[ \binom{x_j - x_i}{2} + \binom{n + x_i - x_j}{2} \]

crossings in the drawing among edges incident with i and j. Therefore, a lower 
bound for the total number of crossings in the drawing is 

\[ \sum_{1 \le i < j \le n} \left[ \binom{|x_j - x_i|}{2} + \binom{n - |x_j - x_i|}{2} \right]. \]


Using the relation $\binom{y}{2} = y(y - 1)/2$, we see that the lower bound is the 
function 

\[ F(x_1, \ldots, x_n) = \binom{n}{2}^2 + \Bigl( \sum_{1 \le i < j \le n} |x_j - x_i|^2 \Bigr) - n \Bigl( \sum_{1 \le i < j \le n} |x_j - x_i| \Bigr). \]

Ordering the variables so $1 \le x_1 \le x_2 \le \cdots \le x_n \le n$, we see that the lower 
bound is given by the quadratic function 

\[ F(x_1, \ldots, x_n) = \binom{n}{2}^2 + \Bigl( \sum_{1 \le i < j \le n} (x_j - x_i)^2 \Bigr) - n \Bigl( \sum_{1 \le i < j \le n} (x_j - x_i) \Bigr). \]

Clearly F has a minimum, which we shall determine. 
The function F is differentiable and 

\[ \frac{\partial F}{\partial x_i} = 2 \sum_{j \ne i} (x_i - x_j) + n(n - 2i + 1) = 2n x_i - 2 \sum_{j=1}^{n} x_j + n(n - 2i + 1). \]

Setting $S = \sum_{j=1}^{n} x_j$ and $\nabla F = 0$, we find that 

\[ x_i = \frac{2S - n(n - 2i + 1)}{2n}. \]

It is an easy calculation to see that $x_{i+1} - x_i = 1$ and, therefore, setting $x_i = i$ 
yields a solution to these equations. Moreover, every other solution is obtained 
from this one by adding the same quantity t to each $x_i$. 

This means that there is an integral minimum for F, namely $x_i = i$, i = 
1, 2, ..., n. Thus, a lower bound for the number of crossings in a cylindrical 
drawing of $K_{n,n}$ is 

\[ F(1, 2, \ldots, n) = \sum_{1 \le i < j \le n} \binom{j - i}{2} + \sum_{1 \le i < j \le n} \binom{n - j + i}{2} \]
\[ = \sum_{k=1}^{n-1} (n - k) \binom{k}{2} + \sum_{k=1}^{n-1} k \binom{k}{2} = n \sum_{k=1}^{n-1} \binom{k}{2} = n \binom{n}{3}. \]

This is attainable; see Figure 3 for the case n = 4. The drawing of $K_{2n}$ obtained 
from this optimal cylindrical drawing of $K_{n,n}$ has the same number of crossings as 
the conjectured crossing number of $K_{2n}$. 
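The lower bound of this section can be checked numerically for small n. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, `pairwise_bound` evaluates the sum of the two binomial coefficients over all pairs, `quadratic_form` evaluates the expanded quadratic expression, and `brute_force_minimum` searches all integer assignments in {1, ..., n}, which is feasible only for very small n.

```python
from itertools import product
from math import comb


def pairwise_bound(x):
    """Sum over pairs i < j of C(d, 2) + C(n - d, 2), where d = |x_j - x_i|."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = abs(x[j] - x[i])
            total += comb(d, 2) + comb(n - d, 2)
    return total


def quadratic_form(x):
    """C(n, 2)^2 + sum |x_j - x_i|^2 - n * sum |x_j - x_i| over pairs i < j."""
    n = len(x)
    gaps = [abs(x[j] - x[i]) for i in range(n) for j in range(i + 1, n)]
    return comb(n, 2) ** 2 + sum(d * d for d in gaps) - n * sum(gaps)


def brute_force_minimum(n):
    """Smallest value of the bound over all x in {1, ..., n}^n (exponential search)."""
    return min(pairwise_bound(x) for x in product(range(1, n + 1), repeat=n))


if __name__ == "__main__":
    for n in range(2, 8):
        x = list(range(1, n + 1))
        print(n, pairwise_bound(x), quadratic_form(x), n * comb(n, 3))
```

Adding the $2\binom{n}{4}$ crossings from the two ends of the cylinder to $n\binom{n}{3}$ gives $n(n-1)^2(n-2)/4$, which is the conjectured value for $K_{2n}$.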


3. ASYMPTOTICS. The following classical counting argument estimates the 
crossing number of $K_{n+1}$ in terms of the crossing number of $K_n$. Deleting in turn 
each vertex from a drawing of $K_{n+1}$ yields n + 1 different drawings of $K_n$. Each 
of these must have at least $\mathrm{cr}(K_n)$ crossings, so we estimate the crossing number of 
$K_{n+1}$ by $(n + 1)\,\mathrm{cr}(K_n)$. 

How many times do we count a given crossing? A given crossing from $K_{n+1}$ 
occurs in one of the drawings of $K_n$ if the four vertices that are the ends of the 
edges involved in the crossing are all in the $K_n$ we pick. Given that we must have 
these four vertices, there are n − 4 vertices left to be picked from the remaining 
n − 3 vertices of the $K_{n+1}$. Thus, the four vertices (and so the particular crossing) 
are in n − 3 of the $K_n$. Thus, each crossing is counted n − 3 times and we have 
the estimate 

\[ \mathrm{cr}(K_{n+1}) \ge \frac{n + 1}{n - 3}\, \mathrm{cr}(K_n). \]

This estimate is equivalent to 

\[ \frac{\mathrm{cr}(K_{n+1})}{\binom{n+1}{4}} \ge \frac{\mathrm{cr}(K_n)}{\binom{n}{4}}. \]

Therefore, the sequence $\mathrm{cr}(K_n)/\binom{n}{4}$ is nondecreasing. Since it is bounded above 
by 1, it has a limit, say LC (for Limit of Complete graphs). 

An entirely analogous argument shows that $\mathrm{cr}(K_{n,n})/\binom{n}{2}^2$ has a limit LB (for 
Limit of complete Bipartite graphs). The drawings of $K_{n,n}$ such as the second 
drawing in Figure 1 show that LB ≤ 1/4. 

It is easy to see that the conjectures as to the crossing numbers for $K_n$ and 
$K_{n,n}$ imply that LC = 3/8 and LB = 1/4. We now show there is a relation 
between these limits. 

Theorem. LC ≥ (3/2)LB. If LB = 1/4, then LC = 3/8. 


Proof: Let $K_{2n}$ be drawn with $\mathrm{cr}(K_{2n})$ crossings. Within this drawing, there are 
many different drawings of $K_{n,n}$. We need to estimate how many drawings of $K_{n,n}$ 
there are and how many of these contain a given crossing. 

We shall count ordered $K_{n,n}$’s, i.e., those where we first pick one set of n and 
then the other set of n. There are, evidently, $\binom{2n}{n}$ such graphs. 

Now consider a given crossing involving the edges ab and cd of $K_{2n}$. One of a 
and b must be in the first set of n chosen, and similarly for c and d. Thus, there 
are 4 ways to distribute a, b, c, d into the first set of n chosen, if this crossing is to 
occur in the resulting $K_{n,n}$. There are 2n − 4 vertices left, of which n − 2 are to 
be put into the first set of n. Therefore, there are $4\binom{2n-4}{n-2}$ different $K_{n,n}$’s that 
contain the given crossing, and hence 

\[ \mathrm{cr}(K_{2n}) \ge \frac{\binom{2n}{n}}{4\binom{2n-4}{n-2}}\, \mathrm{cr}(K_{n,n}). \]

Divide both sides of this inequality by $\binom{2n}{4}$ and do some easy arithmetic to get 

\[ \frac{\mathrm{cr}(K_{2n})}{\binom{2n}{4}} \ge \frac{3}{2} \cdot \frac{\mathrm{cr}(K_{n,n})}{\binom{n}{2}^2}. \]

Now taking the limit as n tends to infinity, we have the relation 

\[ LC \ge (3/2)\, LB. \]

It follows that if LB = 1/4, then LC ≥ 3/8. Since we have previously noted 
LC ≤ 3/8, it follows that if LB = 1/4, then LC = 3/8. ∎ 
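The “easy arithmetic” in the proof amounts to the identity $\binom{2n}{n} \big/ \bigl(4\binom{2n-4}{n-2}\binom{2n}{4}\bigr) = \tfrac{3}{2} \big/ \binom{n}{2}^2$. The following sketch checks it with exact rational arithmetic; the function names are illustrative and not taken from the article.

```python
from fractions import Fraction
from math import comb


def counting_factor(n: int) -> Fraction:
    """C(2n, n) / (4 * C(2n-4, n-2) * C(2n, 4)): the factor after dividing by C(2n, 4)."""
    return Fraction(comb(2 * n, n), 4 * comb(2 * n - 4, n - 2) * comb(2 * n, 4))


def claimed_factor(n: int) -> Fraction:
    """(3/2) / C(n, 2)^2: the factor that appears in the final inequality."""
    return Fraction(3, 2) / comb(n, 2) ** 2


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, counting_factor(n), claimed_factor(n))
```

Both functions agree for every n ≥ 2 tried, which is what the algebra in the proof predicts.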


This theorem shows that the conjecture for $\mathrm{cr}(K_{n,n})$ implies the conjecture for 
$\mathrm{cr}(K_{2n})$, at least asymptotically. Does the converse hold? 

Probably this cannot be derived by counting. The reason why the proof of the 
theorem works (as the proof shows!) is that any (almost) optimal drawing of $K_{2n}$ 
contains a drawing of $K_{n,n}$ that is economical in the sense that it has (almost) as 
few crossings as the conjectured value for $\mathrm{cr}(K_{n,n})$. 

For the converse, however, we do not know of a natural way to extend (almost) 
optimal drawings of $K_{n,n}$ to economical drawings of $K_{2n}$. The optimal cylindrical 
drawings of $K_{n,n}$ have many more than $\mathrm{cr}(K_{n,n})$ crossings. 

Carries, Combinatorics, 
and an Amazing Matrix 


John M. Holte 


This is the story of a serendipitous discovery. It began when I was investigating a 
mundane subject: carries in addition. To my surprise, a probabilistic perspective 
and some heavy-duty number crunching revealed a mathematical cache: an infinite 
collection of stochastic matrices in every dimension exhibiting an unusual symmetry 
and multifaceted combinatorial features. For each matrix $\Pi$: 


• The eigenvalues are all positive and form a finite, decreasing, geometric 
sequence; furthermore, if we diagonalize $\Pi$ as $U^{-1}\Pi U = D$, where the 
eigenvalues are arranged in decreasing order in the diagonal matrix D, then, 
aside from a constant of proportionality: 

• The entries in the row of $U^{-1}$ corresponding to the eigenvalue 1 are Eulerian 
numbers. 

• The entries in the row of $U^{-1}$ corresponding to the least eigenvalue are the 
entries in a row of Pascal’s triangle, but with alternating signs; the entries in 
the column of U corresponding to this eigenvalue are their reciprocals. 

• The entries in the first and last rows of U are respectively unsigned and signed 
Stirling numbers of the first kind. 


These unanticipated relationships first came to light when I explored the 
territory numerically, using Mathematica®. That started me on a project that 
cycled through phases of computer experimentation, conjecture, and rigorous 
mathematics. The mathematics involved included generating functions, recurrence 
relations, summation and matrix manipulation, combinatorial identities, and 
discrete probability: the techniques of “concrete mathematics” ([8]; see also [18]). 
This article is an invitation to aficionados of concrete mathematics to enjoy a 
guided tour of some wonderful sights. Along the way we will also point out several 
interesting side trips (exercises) for explorers. 


THE PROBLEM. When we add two long random base-ten (say) numbers, how 
often do we have a carry (of 1) from one column to the next? For example, 
consider the following addition of two fifty-digit numbers composed of digits taken 
from a table of random numbers: 


010011 00110 11100 01111 00001 00000 01101 11111 00000 1100 
24003 80475 19793 71578 52010 72216 15692 96689 80452 46312 
+16129 49245 21693 20946 60874 82351 32516 23823 30046 06870 
40133 29720 41486 92525 12885 54567 48209 20513 10498 53182 


We observe that we got a carry-out of 0 in 27 cases and a carry-out of 1 in 23 cases, 
or 54% and 46%, respectively. It would be natural to conjecture that in the long 
run, as the number of digits increases without bound, the relative frequencies 
would be 50%-50%. This is true, and a thorough treatment is given in [12, pp. 
262-263]. 


What happens if we add three long random numbers? When I asked some 
faculty colleagues, they conjectured that carries of 0, 1, and 2 would be equally 
likely. In a seminar for students, one participant confidently asserted that there 


would be mostly 1’s. In the following sum of three 50-digit random numbers 


111011 10111 11000 10111 10210 11102 11122 01011 11210 2112 
05453 03060 83621 43443 07082 04401 15299 64642 73497 38426 
67711 70528 46700 00171 55077 11440 95932 91116 17255 19649 
76306 39287 31026 49339 70267 68885 98147 70311 43856 37376 

149471 12876 61347 92954 32426 84728 09380 26070 34608 95451 


we have 12 (24%) carries of 0, 31 (62%) carries of 1, and 7 (14%) carries of 2; 
perhaps the student was right. 

But are these empirical percentages good estimates of the long-run frequencies? 
And more generally, what is the long-run frequency of each possible carry value 
when we add any number of long numbers represented in any base? 
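To experiment with these questions, one can tabulate the carries directly. The sketch below is a minimal illustration, not the author's Mathematica code: `carry_outs` and `simulate` are hypothetical names, and the pseudorandom numbers come from Python's standard `random` module with an arbitrary seed.

```python
import random
from collections import Counter


def carry_outs(numbers, n_digits, base=10):
    """Return the carry-out of each column, least significant column first."""
    carries = []
    carry = 0
    for k in range(n_digits):
        # Column total: incoming carry plus the k-th digit of every addend.
        column = carry + sum((x // base**k) % base for x in numbers)
        carry = column // base
        carries.append(carry)
    return carries


def simulate(m, n_digits, base=10, seed=0):
    """Count carry values when adding m pseudorandom n_digits-digit numbers."""
    rng = random.Random(seed)
    numbers = [rng.randrange(base**n_digits) for _ in range(m)]
    return Counter(carry_outs(numbers, n_digits, base))


if __name__ == "__main__":
    counts = simulate(m=3, n_digits=5000)
    total = sum(counts.values())
    for value in sorted(counts):
        print(value, counts[value] / total)
```

Increasing `n_digits` gives relative frequencies that can be compared with the empirical percentages quoted above.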


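THE CARRIES PROCESS. Consider the addition of m random n-digit base-b 
numbers: 

\[ \begin{array}{lccccccc}
\text{Carries} & C_n & C_{n-1} & C_{n-2} & \cdots & C_2 & C_1 & C_0 = 0 \\
\text{Addends} & & X_{1,n-1} & X_{1,n-2} & \cdots & X_{1,2} & X_{1,1} & X_{1,0} \\
 & & \vdots & \vdots & & \vdots & \vdots & \vdots \\
 & + & X_{m,n-1} & X_{m,n-2} & \cdots & X_{m,2} & X_{m,1} & X_{m,0} \\
\text{Sum} & S_n & S_{n-1} & S_{n-2} & \cdots & S_2 & S_1 & S_0
\end{array} \]

We assume that the $\{X_{i,k}\}$ are independent uniformly distributed random digits. 
The key to our analysis is this probabilistic insight: the carries form a finite Markov 
chain: 

\[ \Pr(C_{k+1} = c_{k+1} \mid C_k = c_k, \ldots, C_1 = c_1, C_0 = 0) = \Pr(C_{k+1} = c_{k+1} \mid C_k = c_k). \]

This is true because the carry-out $C_{k+1}$ depends only on $C_k$ and, of course, the 
digits $X_{1,k}, X_{2,k}, \ldots,$ and $X_{m,k}$. 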

What are the possible values of $C_k$? Those who have experience with adding 
long columns of figures by hand know that the carry-out can be anything from 0 to 
m − 1.¹ Thus the state space of the carries process $(C_k)$ is {0, 1, ..., m − 1}. 
Furthermore, it is possible to get from any state to any other state in 
$\lfloor \log_b(m - 1) \rfloor + 1$ steps. A probabilist would say that this Markov chain is acyclic 
(aperiodic) and irreducible. 

¹ An interesting induction problem is to prove that the maximum possible value of the carry $C_k$ is 
$m - 1 - \lfloor (m - 1)/b^k \rfloor$. 

Let $\Pi = [\pi_{ij}]$ denote the transition matrix: 

\[ \pi_{ij} = \Pr(\text{carry-out} = j \mid \text{carry-in} = i) \quad \text{where } 0 \le i, j \le m - 1. \]

Because the states of the Markov chain are numbered 0, ..., m − 1, we number 
the rows and columns of $\Pi$ in the same way. Now, to calculate $\pi_{ij}$, consider the 
base-b addition in the kth place: 

\[ C_{k+1} = j \iff jb \le i + X_{1,k} + \cdots + X_{m,k} \le (j + 1)b - 1, \]

where $0 \le X_{1,k}, \ldots, X_{m,k} \le b - 1$. Introducing the slack variable Y, we observe 
that this is equivalent to 

\[ X_{1,k} + \cdots + X_{m,k} + Y = (j + 1)b - 1 - i = z \tag{1} \]

where $0 \le X_{1,k}, \ldots, X_{m,k}, Y \le b - 1$. As Tucker [16, p. 311] notes concerning a 
similar problem, “By using generating functions to solve this problem, we [do] not 
need to know anything about the inclusion-exclusion complexities of this problem. 
Generating functions automatically [perform] the required combinatorial logic!” So 
now we invoke generating functions (and we gear up to the level of chapter 6 of 
[16] or chapter 2 of generatingfunctionology [18]). The number of integer solutions 
of (1) is the same as the coefficient of $x^z$ in $(1 + x + x^2 + \cdots + x^{b-1})^{m+1}$. Because 

\[ (1 + x + x^2 + \cdots + x^{b-1})^{m+1} = (1 - x^b)^{m+1} (1 - x)^{-(m+1)} \]

and 

\[ (1 - x^b)^{m+1} = \sum_{r \ge 0} (-1)^r \binom{m+1}{r} x^{rb} \]

and 

\[ (1 - x)^{-(m+1)} = \sum_{s \ge 0} \binom{m+s}{m} x^s, \]

the desired coefficient is 

\[ \sum_{r \le z/b} (-1)^r \binom{m+1}{r} \binom{m + z - rb}{m}. \]

Since $r \le z/b = j + 1 - (i + 1)/b$ if and only if $r \le j - \lfloor i/b \rfloor$, we may summarize 
our result as follows. 


Theorem 1. The carries process $(C_k)$ for the base-b addition of m random numbers is 
a finite Markov chain with state space {0, 1, ..., m − 1} and transition matrix $\Pi = [\pi_{ij}]$ 
given by 

\[ \pi_{ij} = b^{-m} \sum_{r=0}^{j - \lfloor i/b \rfloor} (-1)^r \binom{m+1}{r} \binom{m - 1 - i + (j + 1 - r)b}{m}. \]

When b = 2, the number of bit-valued solutions of (1) is simply $\binom{m+1}{z}$, so 

\[ \pi_{ij} = 2^{-m} \binom{m+1}{2j - i + 1} \quad \text{in the binary case.} \]


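Let’s look at some other examples. When b = 10 and m = 2, 3, 4, then $\Pi$ is 

\[ \begin{pmatrix} 0.55 & 0.45 \\ 0.45 & 0.55 \end{pmatrix}, \quad
\begin{pmatrix} 0.220 & 0.660 & 0.120 \\ 0.165 & 0.670 & 0.165 \\ 0.120 & 0.660 & 0.220 \end{pmatrix}, \quad
\begin{pmatrix} 0.0715 & 0.5280 & 0.3795 & 0.0210 \\ 0.0495 & 0.4840 & 0.4335 & 0.0330 \\ 0.0330 & 0.4335 & 0.4840 & 0.0495 \\ 0.0210 & 0.3795 & 0.5280 & 0.0715 \end{pmatrix}. \]

The 0.0210 in the upper right corner, for example, signifies that, given a carry-in of 
0 to a column of 4 random decimal digits, the probability of a carry-out of 3 is 
0.0210. For a general base b we obtain the following formulas when m = 2, 3: 

\[ \Pi = \frac{1}{2b} \begin{pmatrix} b + 1 & b - 1 \\ b - 1 & b + 1 \end{pmatrix} \quad \text{and} \quad
\Pi = \frac{1}{6b^2} \begin{pmatrix} b^2 + 3b + 2 & 4b^2 - 4 & b^2 - 3b + 2 \\ b^2 - 1 & 4b^2 + 2 & b^2 - 1 \\ b^2 - 3b + 2 & 4b^2 - 4 & b^2 + 3b + 2 \end{pmatrix}. \]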
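The formula of Theorem 1 is easy to evaluate exactly and to compare with a direct enumeration of digit columns. The sketch below is an illustration under stated assumptions: the function names are hypothetical, `transition_matrix` implements the sum $\pi_{ij} = b^{-m}\sum_{r=0}^{j-\lfloor i/b\rfloor}(-1)^r\binom{m+1}{r}\binom{m-1-i+(j+1-r)b}{m}$, and `transition_matrix_brute` counts all $b^m$ columns, so it is practical only for small m and b.

```python
from fractions import Fraction
from itertools import product
from math import comb


def transition_matrix(m, b):
    """Exact carries transition matrix from the alternating-sum formula of Theorem 1."""
    matrix = []
    for i in range(m):
        row = []
        for j in range(m):
            count = sum(
                (-1) ** r * comb(m + 1, r) * comb(m - 1 - i + (j + 1 - r) * b, m)
                for r in range(j - i // b + 1)  # empty range when j < floor(i / b)
            )
            row.append(Fraction(count, b**m))
        matrix.append(row)
    return matrix


def transition_matrix_brute(m, b):
    """Same matrix by enumerating every column of m base-b digits."""
    matrix = []
    for i in range(m):
        counts = [0] * m
        for digits in product(range(b), repeat=m):
            counts[(i + sum(digits)) // b] += 1
        matrix.append([Fraction(c, b**m) for c in counts])
    return matrix


if __name__ == "__main__":
    for row in transition_matrix(3, 10):
        print([float(p) for p in row])
```

Exact fractions are used so that the comparison between the two constructions is not affected by rounding.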
CROSS-SYMMETRY. These examples reveal that $\Pi$ has an unusual sort of 
symmetry: it is radially symmetric about its center. A typical crossword puzzle grid 
has the same sort of symmetry. This symmetry is familiar to matrix theorists, who 
call it “centrosymmetry,” and to statisticians, who call it “cross-symmetry.” See [17] 
for a survey. 

Theorem 2. For i, j = 0, 1, ..., m − 1, we have $\pi_{m-1-i,\,m-1-j} = \pi_{ij}$, i.e., 

\[ \Pr(C_{k+1} = m - 1 - j \mid C_k = m - 1 - i) = \Pr(C_{k+1} = j \mid C_k = i). \]

Proof: The cross-symmetry is not obvious from the formula in Theorem 1, so we 
turn to the probabilistic definition. Given that $C_k = i$, we have $C_{k+1} = j$ if and 
only if 

\[ jb \le i + X_{1,k} + X_{2,k} + \cdots + X_{m,k} \le (j + 1)b - 1. \tag{2} \]

The $\{X_{l,k}\}$ are independent random variables that are uniformly distributed on 
{0, 1, ..., b − 1}. Accordingly, the equation $X'_{l,k} = b - 1 - X_{l,k}$ defines independent 
random variables that are also uniformly distributed on {0, 1, ..., b − 1}. Now 
if we negate the inequalities (2) and add mb − 1, we get 

\[ (m - 1 - j + 1)b - 1 \ge m - 1 - i + X'_{1,k} + X'_{2,k} + \cdots + X'_{m,k} \ge (m - 1 - j)b, \]

which is the condition for $C_{k+1} = m - 1 - j$ given that $C_k = m - 1 - i$. ∎ 
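Cross-symmetry can be observed directly on integer counts, without any formula. In this sketch (names are illustrative), `carry_counts(m, b)[i][j]` is the number of columns of m base-b digits that turn a carry-in of i into a carry-out of j, and `is_centrosymmetric` tests the relation of Theorem 2.

```python
from itertools import product


def carry_counts(m, b):
    """counts[i][j] = number of digit columns giving carry-out j from carry-in i."""
    counts = [[0] * m for _ in range(m)]
    for i in range(m):
        for digits in product(range(b), repeat=m):
            counts[i][(i + sum(digits)) // b] += 1
    return counts


def is_centrosymmetric(matrix):
    """True when entry (i, j) equals entry (n-1-i, n-1-j) for all i, j."""
    n = len(matrix)
    return all(
        matrix[i][j] == matrix[n - 1 - i][n - 1 - j]
        for i in range(n)
        for j in range(n)
    )


if __name__ == "__main__":
    print(carry_counts(2, 10))
    print(is_centrosymmetric(carry_counts(4, 3)))
```

Dividing each count by $b^m$ recovers the transition probabilities.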


EIGENVALUES AND EIGENVECTORS AND SERENDIPITY. Let’s return to the 
carries problem. It is well known in Markov chain theory that our original question 
concerning the long-run relative frequencies of the carry values is answered by the 
stationary probability vector, i.e., the row vector $v = (p_0, \ldots, p_{m-1})$ with nonnegative 
entries summing to 1 that satisfies $v\Pi = v$. Thus, v is the left eigenvector of $\Pi$ 
associated with the eigenvalue 1. When I used Mathematica® to calculate some 
sample cases, out of curiosity I asked for more than v alone; I asked for the entire 
eigensystem. That’s when I discovered the surprises hidden in the matrix $\Pi$. 

Let’s look at the eigenvalues first. For b = 10 and m = 2, 3, 4, 5, we find 
these eigenvalue sets: {1, 0.1}, {1, 0.1, 0.01}, {1, 0.1, 0.01, 0.001}, and 
{1, 0.1, 0.01, 0.001, 0.0001}. For b = 2 and m = 5 we get {1, 1/2, 1/4, 1/8, 1/16}. 


Conjecture 1. The eigenvalues of $\Pi$ are given by the geometric sequence 
$1, b^{-1}, \ldots, b^{-(m-1)}$. 


The eigenvectors for the two m = 5 cases (b = 10 and b = 2) turn out to be the 
same. Further numerical experimentation shows that the eigenvectors are indepen- 
dent of the base! 


Conjecture 2. The eigenvectors do not depend on b. 


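What do these eigenvectors look like? If we assemble these (row) eigenvectors 
in a matrix $V = [v_{ij}] = [v_{ij}(m)]$ for m = 2, 3, 4, and 5, we get: 

\[ \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 4 & 1 \\ 1 & 0 & -1 \\ 1 & -2 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 11 & 11 & 1 \\ 1 & 3 & -3 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -3 & 3 & -1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 26 & 66 & 26 & 1 \\ 1 & 10 & 0 & -10 & -1 \\ 1 & 2 & -6 & 2 & 1 \\ 1 & -2 & 0 & 2 & -1 \\ 1 & -4 & 6 & -4 & 1 \end{pmatrix}. \tag{3} \]

Familiar sequences emerge at the bottom and top of V. 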
Conjecture 3. The bottom row of V is proportional to a row of Pascal’s triangle, 
but with alternating signs. 


Conjecture 4. The top row of V is proportional to a row of Eulerian numbers. 


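EULERIAN NUMBERS. The first few Eulerian numbers are listed in the following 
table: 

\[ \begin{array}{c|cccccc}
n & \left\langle{n \atop 0}\right\rangle & \left\langle{n \atop 1}\right\rangle & \left\langle{n \atop 2}\right\rangle & \left\langle{n \atop 3}\right\rangle & \left\langle{n \atop 4}\right\rangle & \left\langle{n \atop 5}\right\rangle \\ \hline
0 & 1 \\
1 & 1 & 0 \\
2 & 1 & 1 & 0 \\
3 & 1 & 4 & 1 & 0 \\
4 & 1 & 11 & 11 & 1 & 0 \\
5 & 1 & 26 & 66 & 26 & 1 & 0
\end{array} \]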

It appears that $v_{0j}(m) = \left\langle{m \atop j}\right\rangle$ for j = 0, ..., m − 1. The Eulerian numbers, first 
discussed by Euler (of course) in [6, pp. 485-487], [7, pp. 373-375], arise naturally 
in the study of random permutations; see [13, sect. 5.1.3] and the references there, 
[2], [4], [5, ch. 10], [14], and [15]. They satisfy the recurrence relation (see [8, sect. 
6.2]) 

\[ \left\langle{n \atop k}\right\rangle = (k + 1) \left\langle{n - 1 \atop k}\right\rangle + (n - k) \left\langle{n - 1 \atop k - 1}\right\rangle \quad \text{for integer } n > 0 \tag{4} \]

with the boundary condition $\left\langle{0 \atop k}\right\rangle = \delta_{0k}$, the Kronecker delta. Using this relation 
and induction, one may deduce (as in [3]) 

\[ \sum_k \left\langle{n \atop k}\right\rangle = n!. \tag{5} \]

Anticipating the verification of Conjecture 4, we normalize the Eulerian numbers 
in accordance with (5) to get the stationary probabilities for the carries process: 

\[ p_j = \frac{1}{m!} \left\langle{m \atop j}\right\rangle \quad \text{for } j = 0, \ldots, m - 1. \]

In particular, the long-run relative frequencies of carry values are (1/2, 1/2) for m = 2 
and (1/6, 4/6, 1/6) for m = 3. We see that our empirical values came reasonably close. 
The explicit formula 

\[ \left\langle{n \atop k}\right\rangle = \sum_{r=0}^{k} (-1)^r \binom{n+1}{r} (k + 1 - r)^n \tag{6} \]

was given by Euler himself. Notation for Eulerian numbers is not standardized; our 
notation conforms to that of [8]. 
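The recurrence (4), the explicit formula (6), and the normalization (5) can be cross-checked in a few lines. This is a minimal sketch with hypothetical function names; it assumes the convention of the table above, in which the Eulerian number with upper index n and lower index k vanishes unless 0 ≤ k < n (or n = k = 0).

```python
from fractions import Fraction
from math import comb, factorial


def eulerian_recurrence(n, k):
    """Eulerian number via E(n,k) = (k+1) E(n-1,k) + (n-k) E(n-1,k-1), E(0,k) = [k == 0]."""
    if n == 0:
        return 1 if k == 0 else 0
    if k < 0 or k >= n:
        return 0
    return (k + 1) * eulerian_recurrence(n - 1, k) + (n - k) * eulerian_recurrence(n - 1, k - 1)


def eulerian_explicit(n, k):
    """Eulerian number via the alternating sum of (-1)^r C(n+1, r) (k+1-r)^n over r = 0..k."""
    return sum((-1) ** r * comb(n + 1, r) * (k + 1 - r) ** n for r in range(k + 1))


def stationary_probabilities(m):
    """Conjectured long-run carry frequencies p_j = E(m, j) / m! for j = 0..m-1."""
    return [Fraction(eulerian_explicit(m, j), factorial(m)) for j in range(m)]


if __name__ == "__main__":
    print([eulerian_explicit(5, k) for k in range(5)])
    print(stationary_probabilities(3))
```

For m = 3 the probabilities are 1/6, 2/3, 1/6, to be compared with the empirical 24%, 62%, and 14% found earlier.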


THE EULERIAN RECURRENCE AND V. How can we find an explicit formula 
for V, the matrix whose rows are the left eigenvectors of $\Pi$? If we are clever or 
lucky, we can guess the right answer and then verify it. 
Playing with the V cases in (3), we find that the Eulerian recurrence (4) holds 
also for every row of V, i.e., 

\[ v_{ij}(m) = (j + 1)\, v_{ij}(m - 1) + (m - j)\, v_{i,j-1}(m - 1) \quad \text{for } 0 \le i < m, \tag{7} \]

where we define $v_{i,-1}(m) = 0$ and $v_{i,m}(m) = 0$. This recurrence cannot give us the 
last row of V(m) in terms of V(m − 1), because the latter matrix is short one row. 
But if Conjecture 3 is correct, we can paste in the last row of V(m) by the formula 

\[ v_{m-1,j}(m) = (-1)^j \binom{m-1}{j}. \]

A little calculation shows that $(-1)^j \binom{m-1}{j}$ has the near-Eulerian property 

\[ (-1)^j \binom{m-1}{j} = (j + 1)(-1)^j \binom{m-1}{j} + (m - j)(-1)^{j-1} \binom{m-1}{j-1}, \]

so (7) will be satisfied for i = m − 1 if we define $v_{m-1,j}(m - 1) = (-1)^j \binom{m-1}{j}$, 
or, 

\[ v_{mj}(m) = (-1)^j \binom{m}{j}. \tag{8} \]

This is an equation for the row below the bottom row of V. Now, to see the pattern 
that generalizes, we make the nonobvious observation [1, p. 822] that 

\[ (-1)^j \binom{m}{j} = \binom{m+1}{0} - \binom{m+1}{1} + \cdots + (-1)^j \binom{m+1}{j}. \]

This equation, (8), Conjecture 4, and (6) give 

\[ v_{0j}(m) = \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} (j + 1 - r)^m \quad \text{and} \quad v_{mj}(m) = \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} (j + 1 - r)^0, \]

so we conjecture that 

\[ v_{ij} = v_{ij}(m) = \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} (j + 1 - r)^{m-i}. \tag{9} \]

Theorem 3. Let $V = [v_{ij}]$ be the m × m matrix given by (9) for 0 ≤ i, j < m, and let 
$D = \mathrm{diag}\{1, b^{-1}, \ldots, b^{-(m-1)}\}$. Then 

\[ V \Pi V^{-1} = D. \]


Assuming Theorem 3, we have $\Pi = V^{-1} D V$; this is an equation that may be 
used to define $\Pi = \Pi(b)$ for every complex b ≠ 0 and to prove $\Pi(ab) = \Pi(a)\Pi(b)$ 
for all nonzero complex numbers a and b. When a and b are bases (say a = 5 
and b = 2, whence ab = 10), this may be explained as follows. We may rewrite 
each base-ten digit T in the mixed-radix system having bases 5 and 2: T = 0 × 5 × 
2 + F × 2 + B; now when the carry-in i is applied to the binary column, it leads 
to an intermediate carry of k to the base-5 column with probability $\pi_{ik}(2)$, which 
then generates a carry-out of j with probability $\pi_{kj}(5)$, and so $\pi_{ij}(10) = 
\sum_k \pi_{ik}(2)\pi_{kj}(5)$, i.e., $\Pi(10) = \Pi(2)\Pi(5)$. 
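The multiplicative relation $\Pi(10) = \Pi(2)\Pi(5)$ can be checked exactly for small m. The sketch below builds each transition matrix by brute-force enumeration of digit columns rather than from any closed formula; the names `transition_matrix` and `matmul` are illustrative.

```python
from fractions import Fraction
from itertools import product


def transition_matrix(m, b):
    """Carries transition matrix for m addends in base b, by direct enumeration."""
    matrix = []
    for i in range(m):
        counts = [0] * m
        for digits in product(range(b), repeat=m):
            counts[(i + sum(digits)) // b] += 1
        matrix.append([Fraction(c, b**m) for c in counts])
    return matrix


def matmul(A, B):
    """Plain exact matrix product of two lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


if __name__ == "__main__":
    m = 3
    lhs = transition_matrix(m, 10)
    rhs = matmul(transition_matrix(m, 2), transition_matrix(m, 5))
    print(lhs == rhs)
```

Because $\Pi(a)$ and $\Pi(b)$ share the eigenvector matrix V, the two factors also commute, so the product may be taken in either order.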


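Proof: Concrete Mathematics Ahead. Here we’ll make heavy use of “concrete 
mathematics” techniques. First we observe that 

\[ v_{ij} = \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} (j + 1 - r)^{m-i} \]

is the convolution, or Cauchy product, of the sequences (in k) $\bigl((-1)^k \binom{m+1}{k}\bigr)$ and 
$(k^{m-i})$ evaluated at j + 1, i.e., 

\[ v_{ij} = \text{coefficient of } x^{j+1} \text{ in } \sum_{k \ge 0} (-1)^k \binom{m+1}{k} x^k \sum_{k \ge 0} k^{m-i} x^k. \]

Now we use the binomial theorem and the generating function of $(k^n)$, 

\[ \sum_{k \ge 0} k^n x^k = \Bigl( x \frac{d}{dx} \Bigr)^{n} (1 - x)^{-1}, \tag{10} \]

to get 

\[ v_{ij} = \text{coefficient of } x^{j+1} \text{ in } (1 - x)^{m+1} \Bigl( x \frac{d}{dx} \Bigr)^{m-i} (1 - x)^{-1} \]
\[ \phantom{v_{ij}} = \text{coefficient of } x^{j} \text{ in } x^{-1} (1 - x)^{m+1} \Bigl( x \frac{d}{dx} \Bigr)^{m-i} (1 - x)^{-1}. \]

Thus we obtain the generating function of the ith row of V: 

\[ \sum_{j \ge 0} v_{ij} x^j = x^{-1} (1 - x)^{m+1} \Bigl( x \frac{d}{dx} \Bigr)^{m-i} (1 - x)^{-1}. \tag{11} \]

When i = 0, this generating function is $x^{-1} f_m(x)$, where $f_m(x)$ is the Eulerian 
polynomial of degree m (see [14], [4]). 
We must show 

\[ \sum_{k=0}^{m-1} v_{ik} \pi_{kj} = b^{-i} v_{ij} \quad \text{for } i, j = 0, 1, \ldots, m - 1. \]

By substituting, interchanging the order of summation (an entertaining exercise!), 
and simplifying, we get 

\[ \sum_{k=0}^{m-1} v_{ik} \pi_{kj} = b^{-m} \sum_{k=0}^{m-1} \sum_{r=0}^{j - \lfloor k/b \rfloor} (-1)^r \binom{m+1}{r} \binom{m - 1 - k + (j + 1 - r)b}{m} v_{ik} \]
\[ = b^{-m} \sum_{r=0}^{j} \sum_{k=0}^{(m-1) \wedge ((j+1-r)b - 1)} (-1)^r \binom{m+1}{r} \binom{m - 1 - k + (j + 1 - r)b}{m} v_{ik} \]
\[ = b^{-m} \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} \sum_{k=0}^{(j+1-r)b - 1} \binom{m - 1 - k + (j + 1 - r)b}{m} v_{ik}. \]

Let K = (j + 1 − r)b − 1. The inner sum, $\sum_{k=0}^{K} \binom{m + K - k}{m} v_{ik}$, is the convolution of 
the sequences (in k) $\bigl(\binom{m+k}{m}\bigr)$ and $(v_{ik})$ evaluated at K. We know that 

\[ \sum_{k \ge 0} \binom{m+k}{m} x^k = (1 - x)^{-(m+1)}, \]

and we have the generating function of $(v_{ik})$ in (11). Thus, the inner sum is equal 
to the coefficient of $x^K$ in 

\[ (1 - x)^{-(m+1)} \, x^{-1} (1 - x)^{m+1} \Bigl( x \frac{d}{dx} \Bigr)^{m-i} (1 - x)^{-1} = x^{-1} \Bigl( x \frac{d}{dx} \Bigr)^{m-i} (1 - x)^{-1}, \]

which, invoking (10), is 

\[ \sum_{k=0}^{K} \binom{m + K - k}{m} v_{ik} = (K + 1)^{m-i} = \bigl((j + 1 - r)b\bigr)^{m-i}. \]

Therefore, 

\[ \sum_{k=0}^{m-1} v_{ik} \pi_{kj} = b^{-m} \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} \bigl((j + 1 - r)b\bigr)^{m-i} 
= b^{-i} \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} (j + 1 - r)^{m-i} = b^{-i} v_{ij}. \quad ∎ \]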
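The statement just proved, that row i of V is a left eigenvector of $\Pi$ with eigenvalue $b^{-i}$, can be spot-checked with exact arithmetic. In the sketch below the function names are illustrative; V is built from the alternating sum $\sum_{r=0}^{j}(-1)^r\binom{m+1}{r}(j+1-r)^{m-i}$ and $\Pi$ is built independently by enumerating digit columns.

```python
from fractions import Fraction
from itertools import product
from math import comb


def left_eigenvector_matrix(m):
    """V[i][j] = sum_{r=0}^{j} (-1)^r C(m+1, r) (j+1-r)^(m-i), for 0 <= i, j < m."""
    return [
        [sum((-1) ** r * comb(m + 1, r) * (j + 1 - r) ** (m - i) for r in range(j + 1))
         for j in range(m)]
        for i in range(m)
    ]


def transition_matrix(m, b):
    """Carries transition matrix by direct enumeration of all b**m digit columns."""
    matrix = []
    for i in range(m):
        counts = [0] * m
        for digits in product(range(b), repeat=m):
            counts[(i + sum(digits)) // b] += 1
        matrix.append([Fraction(c, b**m) for c in counts])
    return matrix


def rows_are_left_eigenvectors(m, b):
    """Check (V * Pi)[i][j] == b**(-i) * V[i][j] for every entry."""
    V = left_eigenvector_matrix(m)
    P = transition_matrix(m, b)
    return all(
        sum(V[i][k] * P[k][j] for k in range(m)) == Fraction(V[i][j], b**i)
        for i in range(m)
        for j in range(m)
    )


if __name__ == "__main__":
    print(left_eigenvector_matrix(4))
    print(rows_are_left_eigenvectors(4, 10))
```

Note that `left_eigenvector_matrix` does not take b as an argument, which reflects the observation that the eigenvectors do not depend on the base.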


CONFIRMATION OF THE CONJECTURES AND MORE. Theorem 3 tells us 
that the rows of V are left eigenvectors of $\Pi$ corresponding to the eigenvalues 
$1, b^{-1}, \ldots, b^{-(m-1)}$, so Conjectures 1 and 2 are true. Also the formula for $v_{0j}$ is 
the same as the explicit formula (6) for the Eulerian numbers, so Conjecture 4 is 
true. Finally, letting i = m − 1 in (11), we get 

\[ \sum_{j \ge 0} v_{m-1,j}\, x^j = x^{-1} (1 - x)^{m+1} \Bigl( x \frac{d}{dx} \Bigr) (1 - x)^{-1} = (1 - x)^{m-1}, \]

so Conjecture 3 follows, by the binomial theorem. 

There are other patterns in V. It is easy to verify that the leftmost column of V is 
all 1’s. It is a little harder to show that the rightmost column has alternating +1’s 
and −1’s, but it is a splendid opportunity to use the calculus of finite differences. 
Both exercises are left to the reader. 


THE RIGHT EIGENVECTOR MATRIX U: EMPIRICAL RESULTS. Let’s look at 
the right eigenvectors of the transition matrix $\Pi$. As an alternative to direct 
computation of the eigenvectors, we may compute the inverse of the matrix V. 
Numerical experimentation reveals that, in order to get integer values, we should 
multiply by m!, so we let $U = m!\,V^{-1}$. For m = 2, 3, 4, 5, we find that U is: 

\[ \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 3 & 2 \\ 1 & 0 & -1 \\ 1 & -3 & 2 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 6 & 11 & 6 \\ 1 & 2 & -1 & -2 \\ 1 & -2 & -1 & 2 \\ 1 & -6 & 11 & -6 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 10 & 35 & 50 & 24 \\ 1 & 5 & 5 & -5 & -6 \\ 1 & 0 & -5 & 0 & 4 \\ 1 & -5 & 5 & 5 & -6 \\ 1 & -10 & 35 & -50 & 24 \end{pmatrix}. \]

Tantalizing patterns are already visible in these first few examples. Even though 
the columns are the eigenvectors, the top and bottom rows leap out at the 
combinatorial cognoscenti: They are Stirling numbers of the first kind! The pattern 
of the eigenvector in the last column may be exposed by dividing by (m − 1)!: 
reciprocals of binomial coefficients with alternating signs! Forming difference 
tables of the columns reveals more patterns: It appears that the jth difference of 
the jth column is a constant, $(-1)^j m!/(m - j)!$, which would make the jth 
column a polynomial of degree j in the row index i. To summarize, for the matrix 
$U = m!\,V^{-1}$, we propose: 


Conjecture 5. Column j is a degree-j polynomial function of row index i. 
Conjecture 6. The entries in the final column are proportional to reciprocals of 
entries in a row of Pascal’s triangle with alternating signs. 

Conjecture 7. The top row consists of unsigned Stirling numbers of the first kind 
(in reverse order). 

Conjecture 8. The bottom row consists of signed Stirling numbers of the first kind 
(in reverse order). 


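STIRLING NUMBERS OF THE FIRST KIND. The first few (unsigned) Stirling 
numbers of the first kind are as follows. 

\[ \begin{array}{c|cccccc}
n & \left[{n \atop 0}\right] & \left[{n \atop 1}\right] & \left[{n \atop 2}\right] & \left[{n \atop 3}\right] & \left[{n \atop 4}\right] & \left[{n \atop 5}\right] \\ \hline
0 & 1 \\
1 & 0 & 1 \\
2 & 0 & 1 & 1 \\
3 & 0 & 2 & 3 & 1 \\
4 & 0 & 6 & 11 & 6 & 1 \\
5 & 0 & 24 & 50 & 35 & 10 & 1
\end{array} \]

The Stirling number $\left[{n \atop k}\right]$ may be characterized combinatorially as the number of 
ways n objects can be arranged into k cycles, but for our purposes we characterize 
Stirling numbers algebraically. Rising factorial powers may be represented in terms 
of ordinary powers by means of unsigned Stirling numbers of the first kind: 

\[ x^{\overline{n}} = x(x + 1)(x + 2) \cdots (x + n - 1) = \sum_k \left[{n \atop k}\right] x^k; \tag{12} \]

falling factorial powers may be represented in terms of ordinary powers by means 
of signed Stirling numbers of the first kind: 

\[ x^{\underline{n}} = x(x - 1)(x - 2) \cdots (x - n + 1) = \sum_k (-1)^{n-k} \left[{n \atop k}\right] x^k. \tag{13} \]

(See [8, sect. 6.1], [11, pp. 65-68], or [10, ch. 4].) 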


THE RIGHT EIGENVECTORS. How can we find an explicit formula for the 
matrix U of right eigenvectors of $\Pi$? One way would be to solve $\Pi U = UD$, which 
appears to be very difficult. My way was to find U by solving $UV = m!\,I$. It turns 
out to take longer to solve this equation than it does to prove the answer is right, 
so let’s start with the answer. 


Theorem 4. Let $V = [v_{ij}]$ be the m × m matrix given by (9). Then $m!\,V^{-1} = [u_{ij}]$ 
where 

\[ u_{ij} = u_{ij}(m) = \sum_{r=m-j}^{m} (-1)^{m-r} \left[{m \atop r}\right] \binom{r}{m-j} (m - 1 - i)^{r-m+j} \]

for 0 ≤ i, j < m and where $0^0$ is taken to be 1. 




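Proof: We shall show that $\sum_{k=0}^{m-1} u_{ik} v_{kj} = m!\,\delta_{ij}$. We start with the standard trick of 
interchanging the order of summation: 

\[ \sum_{k=0}^{m-1} u_{ik} v_{kj} = \sum_{k=0}^{m-1} u_{ik} \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} (j + 1 - r)^{m-k} 
= \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} \sum_{k=0}^{m-1} u_{ik} (j + 1 - r)^{m-k}. \]

Here we rewrite the inner sum as follows (note that $0 = (m - 1 - i)^{\underline{m}}$ in the second line and 
the interchange trick is used again in the third line): 

\[ \sum_{k=0}^{m-1} u_{ik} (j + 1 - r)^{m-k} \]
\[ = \sum_{k=1}^{m} u_{i,m-k} (j + 1 - r)^{k} \]
\[ = \sum_{k=0}^{m} \Bigl[ \sum_{s=k}^{m} (-1)^{m-s} \left[{m \atop s}\right] \binom{s}{k} (m - 1 - i)^{s-k} \Bigr] (j + 1 - r)^{k} \]
\[ = \sum_{s=0}^{m} (-1)^{m-s} \left[{m \atop s}\right] \sum_{k=0}^{s} \binom{s}{k} (m - 1 - i)^{s-k} (j + 1 - r)^{k} \]
\[ = \sum_{s=0}^{m} (-1)^{m-s} \left[{m \atop s}\right] (m - 1 - i + j + 1 - r)^{s} \quad \text{[by the binomial theorem]} \]
\[ = (m - i + j - r)^{\underline{m}} \quad \text{[by (13)]} \]
\[ = m! \binom{m - i + j - r}{m}. \]

Therefore, 

\[ \sum_{k=0}^{m-1} u_{ik} v_{kj} = m! \sum_{r=0}^{j} (-1)^r \binom{m+1}{r} \binom{m - i + j - r}{m}. \]

Note that $\binom{m - i + j - r}{m} = 0$ if 0 ≤ m − i + j − r < m, i.e., j − i < r ≤ m − i + j. If 
0 ≤ j < i < m, every term in the last equation is 0; if 0 ≤ i = j < m, only the 
r = 0 term is nonzero (it is m!); if 0 ≤ i < j < m, we may add zero terms to get 

\[ \sum_{r=0}^{m+1} (-1)^r \binom{m+1}{r} \binom{m - i + j - r}{m} = \Delta^{m+1} (\text{polynomial in } r \text{ of degree } m) = 0. \]

Therefore, 

\[ \sum_{k=0}^{m-1} u_{ik} v_{kj} = m!\,\delta_{ij}. \quad ∎ \]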

CONFIRMATION OF THE CONJECTURES. The formula for U is complicated 
enough that it still takes some work to verify our conjectures. Conjectures 5, 6, and 
8 are left as exercises, and we turn to Conjecture 7, which claims that the top 
row of U contains unsigned Stirling numbers of the first kind: For j = 0, ..., m − 1, 

\[ u_{0j}(m) = \left[{m \atop m - j}\right]. \]

This conjecture neatly reduces to the two basic identities relating Stirling numbers 
of the first kind to factorial powers. By Theorem 4, for n = 1, ..., m, 

\[ u_{0,m-n}(m) = \sum_{r=n}^{m} (-1)^{m-r} \left[{m \atop r}\right] \binom{r}{n} (m - 1)^{r-n}. \]

Thus, Conjecture 7 is equivalent to the identity 

\[ \sum_{r=n}^{m} (-1)^{m-r} \left[{m \atop r}\right] \binom{r}{n} (m - 1)^{r-n} = \left[{m \atop n}\right]. \tag{14} \]

Fix m ≥ n, and let $a_n$ denote the left side of (14). Switching the order of 
summation, we find that the generating function of $(a_n)$ is 

\[ \sum_{n \ge 0} a_n x^n = \sum_{r=0}^{m} (-1)^{m-r} \left[{m \atop r}\right] \sum_{n=0}^{r} \binom{r}{n} (m - 1)^{r-n} x^n \]
\[ = \sum_{r=0}^{m} (-1)^{m-r} \left[{m \atop r}\right] (x + m - 1)^{r} \quad \text{[by the binomial theorem]} \]
\[ = (x + m - 1)^{\underline{m}} \quad \text{[by (13)]} \]
\[ = (x + m - 1)(x + m - 2) \cdots (x) \]
\[ = x^{\overline{m}}, \]

which is the generating function of $\bigl(\left[{m \atop n}\right]\bigr)$ by (12). Therefore, $a_n = \left[{m \atop n}\right]$, i.e., 
identity (14) holds, so Conjecture 7 is true. 
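Identity (14) lends itself to a direct numerical check. The following sketch uses hypothetical helper names and the standard recurrence for unsigned Stirling numbers of the first kind; it evaluates the left side of (14) for small m and n and compares it with the corresponding Stirling number.

```python
from math import comb


def stirling_cycle_row(n):
    """Row n of unsigned Stirling numbers of the first kind: [n,0], ..., [n,n]."""
    row = [1]
    for t in range(1, n + 1):
        row = [
            (t - 1) * (row[k] if k < len(row) else 0) + (row[k - 1] if k >= 1 else 0)
            for k in range(t + 1)
        ]
    return row


def identity_14_left_side(m, n):
    """sum_{r=n}^{m} (-1)^(m-r) [m,r] C(r,n) (m-1)^(r-n), with 0**0 taken as 1."""
    s = stirling_cycle_row(m)
    return sum(
        (-1) ** (m - r) * s[r] * comb(r, n) * (m - 1) ** (r - n)
        for r in range(n, m + 1)
    )


if __name__ == "__main__":
    m = 5
    print([identity_14_left_side(m, n) for n in range(1, m + 1)])
    print(stirling_cycle_row(m)[1:])
```

For m = 5 both printed lists are 24, 50, 35, 10, 1.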


FURTHER CONSEQUENCES AND EXPLORATIONS. Many people are fascinated 
by combinatorial identities like (14), and there are many to be found in the 
context of the carries transition matrix. For example, the empirical recurrence 
identity for V, (7), is indeed true, and provides a family of arrays satisfying the 
Eulerian recurrence (4). Other identities, including familiar ones, may be extracted 
from the matrix equations $\Pi U = UD$, $V\Pi = DV$, $UV = m!\,I$, and $VU = m!\,I$. Here 
is just one illustration: Set i = 0 and j = m − 1 > 0 in $\sum_k v_{ik} u_{kj} = m!\,\delta_{ij}$ and get 

\[ \sum_k (-1)^k \left\langle{m \atop k}\right\rangle \Big/ \binom{m-1}{k} = 0. \]
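The identity obtained from i = 0 and j = m − 1, namely that the alternating sum of Eulerian numbers divided by the binomial coefficients $\binom{m-1}{k}$ vanishes, can be confirmed with exact fractions. The function names below are illustrative, and the Eulerian numbers are computed from the explicit alternating-sum formula.

```python
from fractions import Fraction
from math import comb


def eulerian(n, k):
    """Eulerian number via sum_{r=0}^{k} (-1)^r C(n+1, r) (k+1-r)^n."""
    return sum((-1) ** r * comb(n + 1, r) * (k + 1 - r) ** n for r in range(k + 1))


def alternating_ratio_sum(m):
    """sum_{k=0}^{m-1} (-1)^k * Eulerian(m, k) / C(m-1, k), as an exact fraction."""
    return sum(Fraction((-1) ** k * eulerian(m, k), comb(m - 1, k)) for k in range(m))


if __name__ == "__main__":
    for m in range(2, 8):
        print(m, alternating_ratio_sum(m))
```

For m = 3, for instance, the sum is 1/1 − 4/2 + 1/1 = 0.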


Besides identities, we have seen geometric sequences, binomial coefficients, 
Eulerian numbers, and Stirling numbers of the first kind. What about other special 
numbers, like Stirling numbers of the second kind? Are they lurking nearby? Yes, 
indeed. Stirling numbers of the second kind crop up naturally in formulas for the 
factorial moments of the stationary probability distribution of $\Pi$; alternatively, the 
nth factorial moment is exactly the generalized Bernoulli number B“~”? (see [5, 
chapter 15]). A different sort of result is that the stationary probabilities are 
asymptotically normally distributed [5, pp. 150-154]. I hope some readers are 
inspired to discover other interesting connections. 

Going beyond these kinds of propositions, we may put the matrix Π in a larger
context: It is the x = 1 case of the matrix [7, jx'], which plays a central role in the 
analysis of the asymptotic prime-power divisibility of multinomial coefficients [9]. 
Our tour has revealed a combinatorial richness hidden in the matrix Π. But it
leaves unanswered the question, Why are all these combinatorially significant 
relationships connected with the carries matrix? 


ACKNOWLEDGMENTS. I thank Jennifer Galovich and Paul Fjelstad for pointing out the connections
with Eulerian numbers, and I thank Paul Fjelstad, Stephen Hilding, Ron Rietz, and Herbert Wilf for 
their comments on earlier drafts of this paper. 
A New Look at Euler’s Theorem
for Polyhedra: A Comment 


Walter Nef 


In their interesting paper [6] the authors, Branko Grünbaum and Geoffrey
Shephard, mention (pages 125-126) “... the work of Nef [20-25], which our note
parallels to some extent. Nef’s definition of polyhedral sets differs from ours, but is 
equivalent to it; his approach to the Euler characteristic is the same as ours.” Actually, 
this parallelism concerns a major part of the article and comes partly into conflict 
with my own work in the field. The main reason is the authors’ inexplicable 
statement (p. 126) “Nef [1978] defines as faces of a polyhedral set P any family of 
disjoint, relatively open convex sets that is a dissection of P. ... However, these ‘faces’ 
are, in general, not uniquely determined, and have only a limited geometric significance.” 
In fact my definition is the one in [8, p. 6.2], or in [1, p. 98]. According to this 
definition the faces of a polyhedron turn out to be the relative interiors of the 
faces in the “intuitive” sense. They are uniquely determined and have a clear 
geometric significance. (Unfortunately, in [1] a typing error slipped in: On p. 98 in
formula (06) $U_i$ should be replaced by $P \cap U_i$, and $U$ by $P \cap U$.)

In a thorough discussion, several further points would have to be critically 
looked at. I confine myself to two of them: 

On page 122, Grünbaum and Shephard present their Theorem 2* concerning
the Euler characteristic of (not necessarily bounded) closed convex polyhedra. 
They overlook that I have published the same result previously in [9, Satz 4, p. 68]. 

On page 117, the authors define the Euler characteristic χ(P) (in the same way
as I have done in [10, Satz 2, pp. 44-45]): For a cell C (a nonempty relatively open
convex polyhedron) we put $\chi(C) = (-1)^{\dim C}$; furthermore, $\chi(\emptyset) = 0$. If
$P = \bigcup_{C \in \mathcal{C}} C$ represents P as a finite disjoint union of cells, then
$\chi(P) = \sum_{C \in \mathcal{C}} \chi(C)$.
The definition is followed by three theorems, the first of which asserts that χ(P)
does not depend on the partition of P into cells. (For a proof of this theorem see
[10, Satz 2, pp. 44-45].) The second theorem states that χ(P) = 1 if P is bounded,
closed, and convex. The proofs are omitted “since they follow the usual techniques.” 
From the discovery of his theorem by Euler (in 1750) more than 200 years were
needed to find a complete proof ([5, p. 134], [7, p. 94], [2]); for an extended
discussion of the historical development see [6, pp. 122-127]. It is therefore not
evident what is meant by “the usual techniques,” at least in the case of the second
theorem, which is an essential part of the theorem of Euler-Schläfli (Leonhard
Euler [3] and [4], and the extension to higher dimensions in 1850 by Ludwig
Schläfli [12]). In [11] I have published a particularly short and simple proof which I
repeat here: First of all, every closed polyhedron P is the disjoint union of its faces
(without ext P) [8, Satz 3, p. 6.7]. If P is closed and convex, the faces are cells [8,
Satz 8, p. 7.12]. Therefore $\chi(P) = \sum_{i=0}^{n} (-1)^i f_i$, where n = dim P and $f_i$ is the
number of faces of dimension i (with $f_n = 1$ for the relative interior of P).
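
As an illustration of this face count (an example added here, not taken from the cited papers), consider a solid cube, so n = 3. Its relatively open faces are 8 vertices, 12 open edges, 6 open squares, and the open interior, and the alternating sum is

$$\chi(P) = \sum_{i=0}^{3} (-1)^i f_i = 8 - 12 + 6 - 1 = 1.$$

Similarly, a closed line segment has $f_0 = 2$ and $f_1 = 1$, giving $\chi = 2 - 1 = 1$.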

Now let P be bounded, closed, and convex, and let Φ denote the family of all
faces F of P with dim F < n. We choose an arbitrary point z ∈ relint P. For each
face F ∈ Φ we define F* := {x = z + λ(F − z) with 0 < λ < 1}, so F* is the
union of all open line-segments joining z with a point of F. It is intuitively clear
and is easy to prove that the F* too are cells, that dim F* = dim F + 1, and that
all F, F* (F ∈ Φ), together with {z} form a partition of P (Figure 1). Since
dim F* = dim F + 1 gives χ(F*) = −χ(F), each term χ(F) + χ(F*) vanishes. Therefore

$$\sum_{i=0}^{n} (-1)^i f_i = \chi(P) = \sum_{F \in \Phi} \bigl(\chi(F) + \chi(F^*)\bigr) + \chi(\{z\}) = \chi(\{z\}) = (-1)^0 = 1,$$

which is the theorem of Euler-Schläfli.


Figure 1 
NOTES


Edited by Jimmie D. Lawson 


The Hopping Hoop 


Tadashi F. Tokieda 


‘A weight is attached to a point of a rough weightless hoop, which then rolls in a 
vertical plane, starting near the position of unstable equilibrium. What happens, 
and is it intuitive?’ 

The problem just quoted is from Littlewood’s delightful Miscellany [1, p.37]. As 
is often the case in phenomena involving a no-slip constraint (‘rough’ indicates that 
the hoop is to roll without slipping), what happens is rather unintuitive. Declares 
Littlewood: ‘The hoop lifts off the ground when the radius vector to the weight 
becomes horizontal.’ 


[Figure: the rolling hoop with its attached weight, in the coordinates used below]


Perhaps the most ingenuous approach to proving that the hoop indeed hops is 
to calculate the force that the hoop exerts against the floor at the point of contact, 
and to check that it changes to negative after the hoop has rolled π/2. The
approach works, but hardly explains why the hoop should hop at all. 

It is more pleasant to reason as follows. If the hoop is always kept in contact 
with the floor, then the weight (call it m) travels along a cycloid. Now imagine that, 
when m comes to a certain point P on the cycloid, the hoop suddenly disappears. 
Then m would continue to free-fall along a parabola tangent to the cycloid at P. 
If, however, the hoop fails to disappear (as it usually does), then 


(1) m presses the hoop down as long as the imagined parabola at P departs 
below the cycloid; 

(2) m pulls the hoop up, and so the hoop hops, as soon as the imagined 
parabola at P departs above the cycloid. 


By construction the parabola and the cycloid have the same zeroth and first terms 
in their Taylor series around P. Therefore departure below or above will be 
decided by an inequality between their second derivatives. 

Let us determine the earliest P for which (2) occurs. We generalize the problem 
somewhat by taking the liberty of shoving m off the point of unstable equilibrium 
with initial velocity $v_0$ (in Littlewood’s original formulation, $v_0 = 0$).

In coordinates as shown, with g and R denoting the gravitational acceleration
and the radius of the hoop, the conservation of energy dictates

$$\frac{m}{2}\left(\dot{x}^2 + \dot{y}^2\right) + mgy = \frac{m}{2} v_0^2 + mg \cdot 2R.$$

Along the cycloid

$$x(t) = R\theta(t) + R\sin\theta(t), \qquad y(t) = R + R\cos\theta(t)$$

we have

$$\frac{m}{2}\left[\left(R\dot\theta + R\cos\theta \cdot \dot\theta\right)^2 + \left(-R\sin\theta \cdot \dot\theta\right)^2\right] + mg(R + R\cos\theta) = \frac{m}{2} v_0^2 + mg \cdot 2R,$$

which unravels to

$$\dot\theta^2 = \frac{4gR\sin^2(\theta/2) + v_0^2}{4R^2\cos^2(\theta/2)}.$$


This relation enables us to express the derivatives of y(t) in terms of θ:

$$\dot{y} = -R\sin\theta \cdot \dot\theta = -\sin(\theta/2)\sqrt{4gR\sin^2(\theta/2) + v_0^2},$$

$$\ddot{y} = -2g\sin^2(\theta/2) - \frac{v_0^2}{4R}.$$

As remarked earlier, m pulls the hoop up, thereby making it hop, as soon as the
second derivative of the parabola exceeds that of the cycloid; i.e. the hop occurs at
minimal θ such that $-g > \ddot{y}(\theta(t))$, or

$$\sin(\theta/2) = \frac{1}{\sqrt{2}}\left[1 - \frac{v_0^2}{4gR}\right]^{1/2}.$$

In particular, for $v_0 = 0$ the hoop hops at θ = π/2, as claimed. We also observe
that for $v_0 = \sqrt{4gR}$ the hoop ‘glides’ immediately without rolling. Naturally, this
escape velocity ought to be larger than the escape velocity $\sqrt{2gR}$ for a circle: since
the cycloid has smaller curvature than the circle does at the peak θ = 0, it is
harder to escape from the cycloid than from the circle.
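
For readers who want numbers, here is a minimal sketch that evaluates the hop condition. It assumes the formulas as read above, namely sin(θ/2) = (1/√2)[1 − v₀²/(4gR)]^(1/2) and ÿ = −2g sin²(θ/2) − v₀²/(4R). The function names and the default values g = 9.81 and R = 1 are illustrative choices, not part of the note.

```python
import math


def hop_angle(v0, g=9.81, R=1.0):
    """Rolling angle theta (radians) at which the hoop leaves the floor.

    Returns 0.0 when v0**2 >= 4*g*R, the case in which the hoop
    'glides' immediately.
    """
    ratio = v0 * v0 / (4.0 * g * R)
    if ratio >= 1.0:
        return 0.0
    return 2.0 * math.asin(math.sqrt((1.0 - ratio) / 2.0))


def y_ddot(theta, v0, g=9.81, R=1.0):
    """Vertical acceleration of the weight while it is forced along the cycloid."""
    return -2.0 * g * math.sin(theta / 2.0) ** 2 - v0 * v0 / (4.0 * R)
```

With `v0 = 0` the function returns π/2 up to rounding, and at the returned angle `y_ddot` equals −g, which is exactly the threshold at which the free-fall parabola stops departing below the cycloid.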

The author thanks F. Almgren for collaborating on an experiment. We taped a 
battery on a hula-hoop and rolled it down the twelfth-floor hallway in Fine Hall; it 
actually hopped. 
Principal Ideal Domains
Are Almost Euclidean 


John Greene 


In most undergraduate level books on abstract algebra, it is shown that every
Euclidean domain (ED) is a principal ideal domain (PID) and every principal ideal
domain is a unique factorization domain (UFD). We thus have a set of implications:
ED ⇒ PID ⇒ UFD. Most (but not all!) books mention that neither converse
is true. But while it is very easy to show that Z[x] is an example of a UFD
that is not a PID, an example of a PID that is not an ED is harder to come by. In
[2], Campoli gives an easy proof that Z[ζ] has the desired properties, where
ζ = (−1 + √−19)/2, by showing that, in his words, Z[ζ] is “almost Euclidean.”
In this note, we show that Campoli’s “almost Euclidean” condition is, in fact,
equivalent to the PID condition.


Definition. An integral domain D is said to be almost Euclidean if there is a
function d: D → Z⁺ ∪ {0} (called an almost Euclidean function) such that

1) d(0) = 0, d(a) > 0 if a ≠ 0,
2) if b ≠ 0, then d(ab) ≥ d(a) for all a ∈ D,
3) for any a, b ∈ D, if b ≠ 0 then either
   i) a = bq for some q ∈ D or
   ii) 0 < d(ax + by) < d(b) for some x, y ∈ D.

Our functions d in this paper will satisfy the stronger condition (2′) that for all a, b
in D, d(ab) = d(a)d(b), from which (2) follows trivially.


Our main result is the following: 


Theorem 1. An integral domain D is a principal ideal domain if and only if it is 
almost Euclidean. 


Proof: Campoli [2] proved that if a ring is almost Euclidean, it is a PID. For
completeness, we repeat the proof here. Let D be almost Euclidean, and let I be
a nonzero ideal in D. Among the elements x ∈ I, let b be an element with a
minimal positive value for d(x). Given a ∈ I, for any x, y ∈ D, ax + by is in I. By
definition of b, it cannot be that 0 < d(ax + by) < d(b), so the other alternative,
a = bq, must hold for some q ∈ D. Thus, I = (b).
Now suppose that D is a PID. Then D is a UFD, so we may define the function
d as follows: Let d(0) = 0, and for any a ≠ 0, if $a = \varepsilon p_1 p_2 \cdots p_n$, where ε is a
unit and $p_1, \dots, p_n$ are irreducibles, let $d(a) = 2^n$. Since d(ab) = d(a)d(b), it is
clear that d satisfies (1) and (2) of the definition. So let a, b ∈ D, with b ≠ 0. Let
I = {ax + by | x, y ∈ D}. Since I is an ideal in D, I = (r) for some r ∈ D with
r ≠ 0. If a = bq for some q ∈ D, then I = (b). Otherwise, I ≠ (b). Since
b ∈ I, b = xr for some x ∈ D, so d(b) ≥ d(r). Since I ≠ (b), x is not a unit.
Thus, d(x) > 1, so d(r) < d(b). If $r = x_1 a + y_1 b$, then we have 0 < d(r) < d(b),
and condition (3) is satisfied by d. ∎
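
The construction in the second half of the proof can be tried out in the simplest PID, D = Z. The sketch below is illustrative only. It takes d(a) = 2^n with n the number of prime factors of a counted with multiplicity, and uses the extended Euclidean algorithm to produce r = gcd(a, b) = ax + by as the generator of the ideal I. The helper names are hypothetical.

```python
def prime_factor_count(a):
    """Number of prime factors of a nonzero integer, counted with multiplicity."""
    a = abs(a)
    count = 0
    p = 2
    while p * p <= a:
        while a % p == 0:
            a //= p
            count += 1
        p += 1
    if a > 1:
        count += 1
    return count


def d(a):
    """The function d from the proof, specialised to D = Z."""
    return 0 if a == 0 else 2 ** prime_factor_count(a)


def extended_gcd(a, b):
    """Return (r, x, y) with r = gcd(a, b) >= 0 and r == a*x + b*y."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    if a < 0:
        a, x0, y0 = -a, -x0, -y0
    return a, x0, y0


def condition_3_witness(a, b):
    """For b != 0: None if b divides a, else (x, y) with 0 < d(a*x + b*y) < d(b)."""
    if a % b == 0:
        return None
    r, x, y = extended_gcd(a, b)
    assert 0 < d(r) < d(b)
    return x, y
```

For example, with a = 10 and b = 4 the generator is r = 2, and d(2) = 2 < d(4) = 4, so alternative (ii) of condition (3) holds.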


Examples of Euclidean domains in abstract algebra texts are almost always of
the form D = F[x], where F is a field, or the ring of integers in Q[√d] for various
small integer values of d. In the latter case, these books introduce the norm of an
element of this ring and use its absolute value as a Euclidean function. In general,
if F is an algebraic number field (a finite extension of Q), then F can be viewed as
a finite dimensional vector space over Q. If a ∈ F, then the map $T_a(x) = ax$ is
obviously a Q-linear transformation from F to F. The norm of a, N(a), is defined
to be the determinant of this transformation. The norm has the following properties:


1) N(ab) = N(a)N(b) for all a, b ∈ F,

2) N(a) = 0 if and only if a = 0,

3) if a is an algebraic integer, then N(a) ∈ Z,

4) an algebraic integer a is a unit if and only if N(a) = ±1.


Properties (1) and (2) are elementary properties of determinants, property (3) is 
mentioned in [5, p. 175], and property (4) is an easy consequence of (1), (2), and 
(3). 


Theorem 2. If D is the set of integers in an algebraic number field, and if D is a
principal ideal domain, then the absolute value of the norm satisfies the conditions of 
an almost Euclidean function. 


Proof: The properties of the function d in Theorem 1 that were used in the proof 
were: 


1) d(ab) = d(a)d(b) 
2) if a € D, then d(a) = 1 if and only if a is a unit. 


Since the absolute value of the norm also has these properties, the proof follows as
in Theorem 1. Thus, given a, b ∈ D, with b ≠ 0, let I = {ax + by | x, y ∈ D} = (r).
If a = bq for some q ∈ D, then I = (b). Otherwise, 0 < |N(r)| < |N(b)|, since
b = xr for some nonunit x ∈ D. ∎


If D is the ring of integers in some finite extension F of Q, we may now check
if D is a principal ideal domain by checking whether or not D is almost Euclidean
with respect to the absolute value of the norm. Thus, number fields are quite
special. Another example of this is the following: In a number field, D is a UFD if
and only if D is a PID [6, page 146]. Campoli [2] used the fact that Z[ζ] with
ζ = (−1 + √−19)/2 is almost Euclidean to show that this ring is a PID. His
techniques can be easily extended to show that this remains true if −19 is replaced
by −43 or −163. In fact, with a little work it is possible to prove the famous result
[1, p. 137]: The ring of integers in $\mathbb{Q}(\sqrt{1 - 4d})$ where d > 0 is a PID if and only if
the polynomial $x^2 + x + d$ is prime for all integers x with 0 ≤ x ≤ d − 2.
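
The primality criterion just quoted is easy to test by machine. The sketch below is an illustration under the reading 0 ≤ x ≤ d − 2 of the range; it only evaluates the polynomial condition and does not itself prove anything about the rings. The function names are hypothetical.

```python
def is_prime(k):
    """Trial-division primality test, adequate for the small values used here."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


def passes_prime_polynomial_test(d):
    """True if x*x + x + d is prime for every integer x with 0 <= x <= d - 2."""
    return all(is_prime(x * x + x + d) for x in range(0, d - 1))


def discriminants_passing(limit):
    """Values 1 - 4d for those d in 1..limit that pass the test."""
    return [1 - 4 * d for d in range(1, limit + 1) if passes_prime_polynomial_test(d)]
```

Running `discriminants_passing(50)` yields −3, −7, −11, −19, −43, −67, −163, which includes the values −19, −43, and −163 mentioned above (for d = 1 the range of x is empty, so the condition holds vacuously).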

One final comment: If D is an almost Euclidean subring of a number field, 
Theorem 2 tells us that we may use the absolute value of the norm as a near 
Euclidean function. Suppose that D is actually Euclidean. Will the absolute value 
of the norm serve as a Euclidean function? It is interesting to note that Hardy and 
Wright [4, p. 212] define a Euclidean domain not in the usual way but explicitly 
using the norm as the Euclidean function. However, the answer to the question is 
that the norm may not work. In fact, it was shown in [3] that Z[ζ], with
ζ = (1 + √69)/2, is an example of a ring which is Euclidean, but not with respect to
the absolute value of the norm.


ACKNOWLEDGMENTS. I would like to thank Joe Gallian and the reviewer for many helpful 
comments, and thank the members of the usenet newsgroup sci.math, especially Henry Cohn, for the 
reference to a ring that is Euclidean but not norm-Euclidean.
A Colorful Determinantal Identity,
a Conjecture of Rota, and Latin Squares 


Shmuel Onn 


1. Rota’s Colorful Conjecture and the Latin Square Conjecture. The following 
conjecture in combinatorial linear algebra is due to Gian-Carlo Rota. 


Rota’s Colorful Conjecture. Let ${}^1W, \dots, {}^nW$ be bases of an n-dimensional vector
space. Then their multiset union can be repartitioned into bases ${}^1U, \dots, {}^nU$ such
that $|{}^iU \cap {}^jW| = 1$ for all i, j.


Regarding the vectors in each ${}^iW$ as colored in color i, the newly sought bases
are colorful, namely contain one vector of each color. So Rota’s Colorful Conjecture
is that any n colored bases of an n-dimensional vector space can be
repartitioned into n colorful bases.

A Latin square of order n is an n by n matrix $L = (L_i^j)$ in which each row and
each column is a permutation of {1,..., n}. More precisely, there are permutations
$\sigma_1, \dots, \sigma_n$ and $\pi_1, \dots, \pi_n$ such that $L_i^j = \sigma_i(j) = \pi_j(i)$ for all i, j. The sign of the
Latin square is defined as the product of all signs of its row and column
permutations

$$\operatorname{sgn}(L) = \prod_{i=1}^{n} \operatorname{sgn}(\sigma_i) \cdot \operatorname{sgn}(\pi_i).$$


A Latin square is even if its sign is positive, and odd otherwise. Let l(n) be the
number of even Latin squares of order n minus the number of odd ones. It is easy
to see that l(n) = 0 for all odd n. The following conjecture is due to Noga Alon
and Michael Tarsi (cf. [1]).


Latin Square Conjecture. The number of even Latin squares of order n minus the
number of odd ones satisfies l(n) ≠ 0 for all even n.


This conjecture has been recently proved by Drisko [4] for all n = p + 1 where
p is a prime. The exact values of l(n) are known only for n ≤ 8 [7]. It is plausible
that l(n) is in fact always nonnegative.

In this note we establish an identity, the Colorful Determinantal Identity, which
links the two conjectures. It shows that for any n, the Latin Square Conjecture
implies Rota’s Colorful Conjecture. In particular, Rota’s Colorful Conjecture is
true for any n = p + 1 where p is a prime. To compactly express the identity, let
$S_n$ be the symmetric group of permutations on {1,...,n} and denote by $S_n^n$ the
collection of n-tuples $\rho = (\rho_1, \dots, \rho_n)$ of permutations. For $\rho \in S_n^n$ let
$\operatorname{sgn}(\rho) = \prod_{i=1}^{n} \operatorname{sgn}(\rho_i)$. For a matrix W let $W^j$ be its j-th column vector. The proof of the
following theorem is given in Section 2.


Theorem 1 (Colorful Determinantal Identity). Let ${}^1W, \dots, {}^nW$ be n square matrices
of order n over an arbitrary field. Then

$$\sum_{\rho \in S_n^n} \operatorname{sgn}(\rho) \prod_{i=1}^{n} \det\left({}^1W^{\rho_1(i)}, \dots, {}^nW^{\rho_n(i)}\right) = l(n) \cdot \prod_{j=1}^{n} \det({}^jW).$$


Note that for each $\rho \in S_n^n$, the tuples $({}^1W^{\rho_1(i)}, \dots, {}^nW^{\rho_n(i)})$, i = 1,...,n, which
appear in the left hand side of the identity, give a colorful repartition of the
multiset of column vectors of the ${}^jW$.
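
For very small n the identity can be checked by brute force. The sketch below is an illustration, not the author's method. It reads the identity as: the sum over n-tuples ρ of permutations of sgn(ρ) times the product over i of det(¹W^{ρ₁(i)}, …, ⁿW^{ρₙ(i)}) equals l(n) times the product of the det(ʲW). Matrices are stored as lists of columns, indices are 0-based, and all function names are hypothetical.

```python
from itertools import permutations, product


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(columns):
    """Exact determinant (Leibniz formula) of the matrix with the given columns."""
    n = len(columns)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for col in range(n):
            term *= columns[col][p[col]]
        total += term
    return total


def latin_difference(n):
    """l(n): number of even Latin squares of order n minus number of odd ones."""
    perms = list(permutations(range(n)))
    total = 0
    for rows in product(perms, repeat=n):
        cols = [tuple(rows[i][j] for i in range(n)) for j in range(n)]
        if all(len(set(c)) == n for c in cols):  # every column is a permutation
            s = 1
            for perm in list(rows) + cols:
                s *= sign(perm)
            total += s
    return total


def colorful_left_side(mats):
    """Left side of the identity; mats[k][j] is column j of the k-th matrix."""
    n = len(mats)
    perms = list(permutations(range(n)))
    total = 0
    for rho in product(perms, repeat=n):
        term = 1
        for r in rho:
            term *= sign(r)
        for i in range(n):
            term *= det([mats[k][rho[k][i]] for k in range(n)])
        total += term
    return total


def colorful_right_side(mats):
    """Right side of the identity: l(n) times the product of the determinants."""
    result = latin_difference(len(mats))
    for m in mats:
        result *= det(m)
    return result
```

With the two matrices whose columns are (1, 2), (3, 5) and (2, 1), (1, 1), both sides evaluate to −2, in agreement with l(2) = 2. The enumeration grows like (n!)^n, so this is only practical for n ≤ 3 or so.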


Corollary 1. For any even n the Latin Square Conjecture implies Rota’s Colorful 
Conjecture over any field of characteristic which does not divide I(n) (in particular 
characteristic zero). 


Proof of Corollary 1 from Theorem 1. Suppose that l(n) ≠ 0 and let ${}^1W, \dots, {}^nW$ be
given bases, i.e. nonsingular matrices of order n over a field satisfying the
hypothesis. Then the right hand side of the Colorful Determinantal Identity is
nonzero. Therefore, on the left hand side, there must exist a nonzero summand

$$\prod_{i=1}^{n} \det\left({}^1W^{\rho_1(i)}, \dots, {}^nW^{\rho_n(i)}\right),$$

and so the sets ${}^iU = \{{}^1W^{\rho_1(i)}, \dots, {}^nW^{\rho_n(i)}\}$, i = 1,...,n,
give the desired repartition into colorful bases. ∎


Corollary 1 had been independently obtained by Huang and Rota, but their
derivation is quite complex and indirect and involves an intermediate conjecture
of Rota on a certain straightening coefficient in the so-called supersymmetric
bracket algebra [6].
Rota’s Colorful Conjecture has a natural generalization to matroids [6] which
had been verified only for n = 3 [3]. Another generalization of Rota’s conjecture is
a conjecture of Jeff Kahn (cf. [6]) that concerns $n^2$ bases, for which we have
derived a determinantal identity which is reminiscent of Theorem 1. The special 
case of Kahn’s conjecture in which the vectors are in general position is known as 
the Dinitz problem and was recently settled affirmatively by Galvin [5]. We refer 
the reader to [2] for some related algorithmic problems and a discussion of their 
computational complexity. 


2. The Colorful Determinantal Identity 


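Theorem 1 (Colorful Determinantal Identity). Let ${}^1W, \dots, {}^nW$ be n square matrices
of order n over an arbitrary field. Then

$$\sum_{\rho \in S_n^n} \operatorname{sgn}(\rho) \prod_{i=1}^{n} \det\left({}^1W^{\rho_1(i)}, \dots, {}^nW^{\rho_n(i)}\right) = l(n) \cdot \prod_{j=1}^{n} \det({}^jW).$$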


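Proof: For a matrix W let $W^j$ be its j-th column vector as before, and let $W_i$
denote its i-th row vector and $W_i^j$ its (i, j)-th entry. Given n square matrices
${}^1W, \dots, {}^nW$ of size n, define the following polynomial in their entries:

$$\Delta = \sum_{\sigma, \rho \in S_n^n} \operatorname{sgn}(\sigma)\operatorname{sgn}(\rho) \prod_{i,j=1}^{n} {}^jW_{\sigma_i(j)}^{\rho_j(i)}.$$

For each ρ and σ in $S_n^n$ define

$$\Delta^{\rho} = \sum_{\sigma \in S_n^n} \operatorname{sgn}(\sigma) \prod_{i,j=1}^{n} {}^jW_{\sigma_i(j)}^{\rho_j(i)} = \prod_{i=1}^{n} \sum_{\sigma_i \in S_n} \operatorname{sgn}(\sigma_i) \prod_{j=1}^{n} {}^jW_{\sigma_i(j)}^{\rho_j(i)} = \prod_{i=1}^{n} \det\left({}^1W^{\rho_1(i)}, \dots, {}^nW^{\rho_n(i)}\right),$$

$$\Delta_{\sigma} = \sum_{\rho \in S_n^n} \operatorname{sgn}(\rho) \prod_{i,j=1}^{n} {}^jW_{\sigma_i(j)}^{\rho_j(i)} = \prod_{j=1}^{n} \sum_{\rho_j \in S_n} \operatorname{sgn}(\rho_j) \prod_{i=1}^{n} {}^jW_{\sigma_i(j)}^{\rho_j(i)} = \prod_{j=1}^{n} \det\left({}^jW_{\sigma_1(j)}, \dots, {}^jW_{\sigma_n(j)}\right).$$

Now $\Delta_\sigma$ is nonzero only for $\sigma = (\sigma_1, \dots, \sigma_n) \in S_n^n$ for which there exists an
element $\pi = (\pi_1, \dots, \pi_n)$ in $S_n^n$ such that $\sigma_i(j) = \pi_j(i)$ for all i, j, in which case

$$\Delta_{\sigma} = \prod_{j=1}^{n} \det\left({}^jW_{\pi_j(1)}, \dots, {}^jW_{\pi_j(n)}\right) = \operatorname{sgn}(\pi) \prod_{j=1}^{n} \det\left({}^jW_1, \dots, {}^jW_n\right) = \operatorname{sgn}(\pi) \prod_{j=1}^{n} \det({}^jW).$$

Each $\sigma \in S_n^n$ for which such $\pi \in S_n^n$ exists gives a Latin square L via $L_i^j = \sigma_i(j) = \pi_j(i)$,
whose sign is given by sgn(L) = sgn(σ)sgn(π). Denote by $\mathcal{L}$ the set of all
Latin squares of order n. Then

$$\sum_{\rho \in S_n^n} \operatorname{sgn}(\rho)\Delta^{\rho} = \Delta = \sum_{\sigma \in S_n^n} \operatorname{sgn}(\sigma)\Delta_{\sigma} = \sum_{L \in \mathcal{L}} \operatorname{sgn}(L) \prod_{j=1}^{n} \det({}^jW) = l(n) \prod_{j=1}^{n} \det({}^jW),$$

which is precisely the Colorful Determinantal Identity. ∎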
Very Semisimple Modules


W. K. Nicholson 


Throughout this note R will denote an associative ring with unity. A left module
M over R is called semisimple if it is a (possibly infinite) sum of simple submodules,
equivalently [1, p. 437], if every submodule of M is a direct summand of M.
The theory of these modules is well known and is basic to the study of noncommutative
rings. We call a module M very semisimple if every principal submodule Rm,
0 ≠ m ∈ M, is simple. It is clear that every such module is semisimple; in this note
we characterize when the converse is true.

Every simple module is very semisimple, as is every vector space over a field (or
a division ring). If p is a prime in the ring Z of integers, and if $Z_p = Z/pZ$ denotes
the ring of integers modulo p, the direct sum $Z_p^{(I)}$ of |I| copies of $Z_p$ is very
semisimple as a Z-module because each nonzero element has order p.

On the other hand, $M = Z_2 \oplus Z_3$ is semisimple as a Z-module but it is not very
semisimple because M = Zm is not simple where m = (1 + 2Z, 1 + 3Z). As the
following proposition shows, the reason is that $Z_2$ and $Z_3$ are not isomorphic.
Recall that a semisimple module M is called homogeneous if any two simple
submodules of M are isomorphic. We are going to show that every very semisimple
module is homogeneous, and the following characterization will be needed.
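
The two examples above can be checked mechanically. The following sketch is an illustration with hypothetical function names. It works with finite Z-modules of the form Z/n₁ ⊕ ⋯ ⊕ Z/n_k and uses the fact that a nonzero finite cyclic Z-module Zm is simple exactly when its order is prime.

```python
from itertools import product


def cyclic_submodule(m, moduli):
    """The set Zm of all integer multiples of m in Z/n1 + ... + Z/nk."""
    elements = set()
    current = tuple(0 for _ in moduli)
    while current not in elements:
        elements.add(current)
        current = tuple((c + x) % n for c, x, n in zip(current, m, moduli))
    return elements


def is_prime(k):
    """Trial-division primality test for small positive integers."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


def generates_simple_submodule(m, moduli):
    """True if the cyclic submodule Zm is simple, i.e. has prime order."""
    return is_prime(len(cyclic_submodule(m, moduli)))


def is_very_semisimple(moduli):
    """True if every nonzero element generates a simple submodule."""
    return all(
        generates_simple_submodule(m, moduli)
        for m in product(*(range(n) for n in moduli))
        if any(m)
    )
```

Here `cyclic_submodule((1, 1), (2, 3))` has six elements, so Zm is all of Z₂ ⊕ Z₃ and is not simple, while `is_very_semisimple((3, 3))` is true because every nonzero element of Z₃ ⊕ Z₃ has order 3.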


Lemma 1. A semisimple left R-module M is very semisimple if and only if $R(m_1 + m_2)$
is simple whenever $Rm_1$ and $Rm_2$ are simple where $m_1$ and $m_2$ are in M and
$m_1 + m_2 \ne 0$.


Proof: If 0 ≠ m ∈ M let $m \in Rx_1 \oplus \cdots \oplus Rx_n$ where each $Rx_i$ is simple [1,
p. 437]. Write $m = m_1 + \cdots + m_n$ where $m_i \in Rx_i$ for each i. We may assume that
each $m_i \ne 0$, so $Rm_i = Rx_i$ is simple. Hence $R(m_1 + m_2)$ is simple by hypothesis
($m_1 + m_2 \ne 0$ because $Rm_1 \oplus Rm_2$ is direct). Again, $(m_1 + m_2) + m_3 \ne 0$ so
$R(m_1 + m_2 + m_3)$ is simple. Continue to conclude that $Rm = R(m_1 + \cdots + m_n)$ is
simple. ∎


Proposition. Every very semisimple module is homogeneous. 


Proof: If Rm and Rm′ are simple where m and m′ are in M, we must show that
Rm ≅ Rm′. We may assume that Rm ≠ Rm′. Define α: R(m + m′) → Rm and
β: R(m + m′) → Rm′ by α[r(m + m′)] = rm and β[r(m + m′)] = rm′. These are
well defined because Rm ⊕ Rm′ is direct (Rm ≠ Rm′), and both are onto. But
R(m + m′) is simple by Lemma 1, so α and β are both isomorphisms, as required. ∎


We hasten to note that not every homogeneous semisimple module is very 
semisimple, and the theorem below will tell us exactly which ones are. 

If X is a subset of a left R-module, denote its annihilator by l(X) = {r ∈ R | rx
= 0 for all x ∈ X}. We abbreviate l({m}) = l(m). For convenience, we write A ◁ R
to mean that A is a two-sided ideal of the ring R. Our general characterization of
very semisimple modules depends on the following fact about simple modules.


Lemma 2. The following conditions are equivalent for a simple left R-module M: 


(1) l(m) ◁ R for every m ∈ M.

(2) l(m) ◁ R for some 0 ≠ m ∈ M.

(3) M ≅ R/A where A ◁ R and A is maximal as a left ideal of R.
(4) l(m) = l(m′) for all m ≠ 0 and m′ ≠ 0 in M.


Proof: Clearly (1) ⇒ (2), and (2) ⇒ (3) using A = l(m). Assume (3), and let
σ: M → R/A be an R-isomorphism where A is as in (3). Given 0 ≠ m ∈ M, write
σm = b + A, b ∈ R. Then l(m) = l(σm) = {r | rb ∈ A}. Because A is an ideal, we
have A ⊆ l(m), and so A = l(m) because A is a maximal left ideal. This proves
that (3) ⇒ (4).

Finally, if (4) holds, let m ∈ M; we must show that l(m) ◁ R. This is clear if
m = 0. If m ≠ 0 write A = l(m) so that A is a left ideal of R. We must show that
Ar ⊆ A for all r ∈ R. This is clear if r ∈ A; if r ∉ A write m′ = rm ≠ 0. Then
A = l(m′) by (4) so 0 = Am′ = Arm. Thus Ar ⊆ A and (1) follows. ∎


Lemma 3. The following conditions are equivalent for a nonzero, homogeneous, 
semisimple left R-module M: 


(1) l(m) = l(m′) for all m ≠ 0 and m′ ≠ 0 in M.

(2) l(m) ◁ R for any m ∈ M with Rm simple.

(3) l(m) ◁ R for some m ∈ M with Rm simple.

(4) l(M) is maximal as a left ideal of R.

(5) $M \cong (R/A)^{(I)}$ for some set I and A ◁ R with A maximal as a left ideal.


In this case M is very semisimple, and if A = l(M) then Δ = R/A is a division ring
and $\operatorname{end}({}_R M) = \operatorname{end}({}_\Delta M)$ where M is a left Δ-space via (r + A)m = rm for all r ∈ R
and m ∈ M.
Proof:

(1) ⇒ (2). Let Rm be simple, m ∈ M. Then l(m) = l(m′) for all 0 ≠ m′ ∈ Rm
by (1), so l(m) ◁ R by Lemma 2.

(2) ⇒ (3). This is clear.

(3) ⇒ (4). By (3) fix $m_0$ ∈ M with $Rm_0$ simple and $l(m_0)$ ◁ R. Write $A = l(m_0)$.


Claim. l(m′) = A for all m′ ∈ M with Rm′ simple.


Proof: $Rm_0$ ≅ Rm′ because M is homogeneous. If α: $Rm_0$ → Rm′ is an isomorphism,
then $l(\alpha m_0) = l(m_0) = A$ ◁ R. As $\alpha m_0$ ∈ Rm′, Lemma 2 (applied to the
simple module Rm′) shows that $l(\alpha m_0)$ = l(m′). This proves the Claim.

Now let 0 ≠ m ∈ M. Then $m \in Rx_1 \oplus \cdots \oplus Rx_n$ where each $Rx_i$ is simple, so
assume $m = m_1 + \cdots + m_n$ where $Rm_i = Rx_i$ for each i. Thus $Am_i = 0$ for each i
by the Claim, so Am = 0. This means A ⊆ l(M), and so A = l(M) because A is a
maximal left ideal. This proves (4).

(4) ⇒ (5). Take A = l(M). Then A ⊆ l(m) ≠ R for all 0 ≠ m ∈ M. Hence
A = l(m) by (4), so Rm ≅ R/A. Now (5) follows because M is a direct sum of
simple submodules.

(5) ⇒ (1). Let 0 ≠ m ∈ M; we must show that Rm is simple. By (5) we may
assume that $M = (R/A)^{(I)}$, so Am = 0 because A is an ideal. Thus A ⊆ l(m), so
A = l(m) because A is a maximal left ideal. This proves (1).

Finally, note that Rm ≅ R/A is simple in (5) ⇒ (1), so M is very semisimple. If
A = l(M) and Δ = R/A, then Δ is a division ring by (4) because 0 is a maximal
left ideal. Since (r + A)m = rm is a well defined Δ-action on M, the last
statement follows. ∎


Note that the last statement in Lemma 3 actually holds for any homogeneous,
semisimple module except that the ring Δ could be any left primitive ring rather
than a division ring.


Theorem. Let M be a nonzero homogeneous, semisimple left R-module. Then M is 
very semisimple if and only if either M is simple or M is non-simple and satisfies the 
conditions in Lemma 3. 


Proof: By Lemma 3, it remains to show that if M is very semisimple but not simple
then l(m) ◁ R for every m ∈ M with Rm simple. By Lemma 2 it suffices to show
that l(m′) = l(m) for all 0 ≠ m′ ∈ Rm. Since M is not simple choose k ∈ M with
k ∉ Rm. Then Rk is simple because M is very semisimple, so Rm ⊕ Rk is direct.
This implies that l(m + k) ⊆ l(m) ∩ l(k). Since R(m + k) is simple it follows that
l(m) = l(m + k) = l(k). Similarly l(m′) = l(k). ∎


Corollary 1. The following conditions on a ring R are equivalent: 


(1) Every maximal left ideal of R is two-sided. 
(2) Every homogeneous, semisimple left R-module is very semisimple. 
(3) Every homogeneous, semisimple left R-module of length 2 is very semisimple.


Proof: Lemma 3 gives (1) ⇒ (2) and (2) ⇒ (3) is clear. If A is a maximal left ideal
of R, then M = R/A ⊕ R/A is very semisimple by (3). Let r ∈ R and consider
m = (1 + A, r + A) in M. It suffices to show that Am = 0 (then 0 = (A, ar + A)
for all a ∈ A). But if Am ≠ 0 then Am = Rm because Rm is simple (M is very
semisimple). Consequently m = am for some a ∈ A. This means (1 + A, r + A) =
(A, ar + A), so 1 ∈ A, a contradiction. Hence (3) ⇒ (1). ∎


The conditions in Corollary 1 are satisfied for every commutative ring and for
every local ring (a ring is called local if the Jacobson radical is the only maximal
left (or right) ideal). They also hold for the ring
$R = \begin{bmatrix} \Delta & \Delta \\ 0 & \Delta \end{bmatrix}$ of upper triangular
matrices over a division ring Δ. In fact the only maximal left ideals of R are
$\begin{bmatrix} 0 & \Delta \\ 0 & \Delta \end{bmatrix}$ and
$\begin{bmatrix} \Delta & \Delta \\ 0 & 0 \end{bmatrix}$, and both are two-sided.

If M is any module and K is any simple submodule, the sum of all submodules
of M that are isomorphic to K is a submodule called the homogeneous component
of M generated by K. It is well known that the socle of M (that is the sum of all
simple submodules of M) is the direct sum of the various homogeneous components.
If A ◁ R, write $r_M(A)$ = {m ∈ M | Am = 0}.


Corollary 2. Let H # 0 be a non-simple homogeneous component of a left R-module 
M. The following are equivalent: 


(1) H is very semisimple.
(2) $H = r_M(A)$ for some A ◁ R such that A is maximal as a left ideal of R.


In this case, A = l(H) and Rm ≅ R/l(H) for all 0 ≠ m ∈ H.


Proof: (1) ⇒ (2). Choose any $m_0$ ∈ H with $Rm_0$ simple, and write $A = l(m_0)$.
Then A is a maximal left ideal of R, and A ◁ R by the Theorem. Moreover,
(1) of Lemma 3 shows that A = l(m) for all 0 ≠ m ∈ H, so H ⊆ $r_M(A)$. But if
0 ≠ x ∈ $r_M(A)$ then A ⊆ l(x). Hence A = l(x) by the maximality of A, and so
Rx ≅ R/A ≅ $Rm_0$. Thus x ∈ H because H is a homogeneous component of M.

(2) ⇒ (1). If H = $r_M(A)$ as in (2), then A ⊆ l(H). Thus A ⊆ l(m) ≠ R for all
0 ≠ m ∈ H, so A = l(m) because A is a maximal left ideal. But then Rm ≅
R/l(m) = R/A is simple, proving (1).

Finally, we have A ⊆ l(H) ≠ R, so A = l(H) by the maximality of A. Thus
Rm ≅ R/A = R/l(H) by the proof of (2) ⇒ (1). ∎
PROBLEMS AND SOLUTIONS


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 


with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, M. J. Pelling, Richard Pfiefer, Leonard Smiley, John Henry 
Steelman, Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY
problems address on the inside front cover. Submitted problems should include
solutions and relevant references. Submitted solutions should arrive at that address
before July 31, 1997. Additional information, such as generalizations and references,
is welcome. The problem number and the solver’s name and address should
appear on each solution. An acknowledgement will be sent only if a mailing label
is provided. An asterisk (*) after the number of a problem or a part of a problem
indicates that no solution is currently available.


PROBLEMS 


10571. Proposed by Jean-Claude Evard and Hillel Gauchman, Eastern Illinois University, 
Charleston, IL. Let x1, ..., X, be nonnegative real numbers and set x = min {x,..., Xy}. 


Consider 
n a n n 
te= (Son) —Lext= (ort) eh Yo 
i=l i=l i=l] 


fora € R. Ifa is a positive integer, consideration of the terms of Or , xi)” shows that 
Ayw = 0. Show that Ay > 0 for all a € (—on, 1] U [2, ov). 


10572. Proposed by Richard P. Stanley, Massachusetts Institute of Technology, Cambridge, 
MA. Let f(n) be the number of graphs (without loops or multiple edges) on the vertices 
1, 2, ..., n such that no path of length three has vertices i, j, k (in that order) with i < j < k.
Let g(n) be the total number of subspaces of an n-dimensional vector space over a 2-element 


field. Show that 
xn - x" 
Yd f(a) =e Ya) —. 
0 n! 0 n! 


10573. Proposed by Y.-F. S. Pétermann, University of Geneva, Geneva, Switzerland. Find 
a continuous function f : [0, ∞) → [0, ∞) satisfying f(0) = 0 and the functional
differential equation f′(t) = 1/f(f(t)) for t > 0, and show that no other such function
exists. 


10574. Proposed by Vartan O. Choulakian, Université de Moncton, Moncton, N. B., 
Canada. Let $\operatorname{Si}(x) = \int_0^x (\sin t / t)\,dt$ denote the sine integral function. Show that
2 3 


(o-) Si 2 (o-) Si 
@ > ( on) _ > (c) dp" ae _ -—. 


<. Si(nz) l 1 
Sia a - a): 
10575. Proposed by Xiaokang Yu, Penn State Altoona Campus, Altoona, PA. Prove that


n 


2l _ m n 
devi (Han Ss SF = ev! (2a +0! 
mao ': 10 l 


[=0 


for every nonnegative integer n. 


10576. Proposed by Donald E. Knuth, Stanford University, Stanford, CA. Alice and Bill 
have identical decks of 52 cards. Alice shuffles her deck and deals the cards face up into 26 
piles of two cards each. Bill does the same with his deck. If any one of Alice’s top cards 
exactly matches any of Bill’s, the matching cards are removed. Play continues until none 
of the cards on top of Alice’s piles matches any of the cards on top of Bill’s piles. What is 
the probability that all 52 pairs of cards will be matched? 


10577. Proposed by Mark Bowron and Stanley Rabinowitz, MathPro Press, Westford, MA. It 
is well known that a maximum of 14 distinct sets are obtainable from one set in a topological 
space by repeatedly applying the operations of closure and complement in any order. Is 
there any bound on the number of sets that can be generated if we further allow arbitrary 
unions to be taken in addition to closures and complements? 


SOLUTIONS 


A Pentagonal Maximum Problem 


6642 [1990, 857]. Proposed by the editors. Let $\lambda$ be the maximum possible inradius of an 
arbitrary triangle lying in the closed set bounded by a regular pentagon of side-length one. 

(a)* Determine $\lambda$ up to an error of at most $10^{-3}$. 

(b)* Determine $\lambda$ exactly. 

Cf. 6477 [1984, 588; 1986, 406; 1989, 945] and 6478 [1984, 588; 1990, 858]. 


Solution of (a) by Richard Stong, Rice University, Houston, TX. With the aid of a computer 
we show that 
\[ 0.37550128 < \lambda < 0.37650126. \]


First we note that enlarging the triangle increases the inradius. Thus, if a vertex of the 
triangle lies in the interior of the pentagon, then we may increase the inradius by moving 
that vertex (along the angle-bisector, say) until it reaches some edge of the pentagon. Thus 
we may assume that all three vertices of the triangle lie on edges of the pentagon. 


[Figure 6642A]    [Figure 6642B] 

Further, if all three vertices of the triangle lie in one of the open half-planes bounded by 
some diagonal of the pentagon, then we can translate the triangle, with a motion perpendicular 
to that diagonal, into the interior of the pentagon and subsequently increase the inradius. 
Therefore we may assume that the vertices of the triangle lie on three non-consecutive 
edges of the pentagon, as shown in Figure 6642A. This reduces the problem to considering 
a function defined for $(s, t, u) \in [0, 1] \times [0, 1] \times [0, 1]$. We need the following easy lemma. 


Lemma. Let $ABC$ and $DEF$ be two triangles in $\mathbb{R}^2$ and let $r(ABC)$ and $r(DEF)$ be their 
inradii. If each vertex of $DEF$ is within $\varepsilon$ of the corresponding vertex of $ABC$, then 


\[ |r(ABC) - r(DEF)| \le \varepsilon. \]


Proof. By symmetry it is enough to show that $r(DEF) \ge r(ABC) - \varepsilon$. We may assume 
that $r(ABC) > \varepsilon$. Let $G$ be the open circular disk centered at the incenter $P$ of $ABC$ with 
radius $r(ABC) - \varepsilon$. Let $L$ be the line tangent to the boundary of $G$ and parallel to $AB$ 
at a distance of $\varepsilon$ from it. Since $D$ and $E$ are within $\varepsilon$ of $A$ and $B$ respectively, both of 
these vertices lie on the opposite side of $L$ from $C$. Hence the segment $DE$ does not enter 
$G$. Similarly the segments $EF$ and $FD$ do not enter $G$. Further, a continuity argument 
involving the triangle with vertices $(1 - \lambda)A + \lambda D$, $(1 - \lambda)B + \lambda E$, $(1 - \lambda)C + \lambda F$, where 
$0 \le \lambda \le 1$, shows that all of $G$, and in particular the point $P$, lies inside the triangle $DEF$. 
Thus $r(DEF) \ge r(ABC) - \varepsilon$ and the lemma is proved. 

We calculated the inradius of each of the $(501)^3$ triangles obtained by letting $s$, $t$, and $u$ 
each run over the 501-element set $\{0.000, 0.002, 0.004, \dots, 0.996, 0.998, 1.000\}$. A lengthy 
computer run found that the largest inradius of any of these $(501)^3$ triangles occurred 
when $s = 0.728$, $t = 1$, $u = 0.726$ (or when $s = 0.726$, $t = 1$, $u = 0.728$) and 
was $0.375501257\ldots$. (A value almost as large, namely $0.3742\ldots$, occurred when $s = 
t = 0.970$, $u = 0.500$.) Any triangle with vertices on the three non-consecutive sides of 
the pentagon shown in Figure 6642A has each vertex within 0.001 of the corresponding 
vertex of one of these $(501)^3$ triangles. Thus by the lemma $\lambda < 0.37650126$. By taking 
$s = u = 0.726832$ and $t = 1$ we obtained the slightly larger inradius $0.375501286\ldots$, so 
that we can infer $\lambda > 0.37550128$. 


Editorial comment. Part (a): The impetus for this problem was C. A. Rogers’s solution 
of MONTHLY Problem 6478, which was based on the assumption that the inradius of any 
triangle lying in the closed set bounded by a regular pentagon of width 1 has an upper bound 
that is strictly less than 1/4. Since the regular pentagon of width 1 has sides of length 
$2\cot(2\pi/5)$, the maximum inradius of any triangle lying in the closed set bounded by a 
regular pentagon of width 1 is $2\lambda\cot(2\pi/5)$. Stong’s results show that 


\[ 0.2440 < 2\lambda\cot(2\pi/5) < 0.2447 < 1/4, \]


which shows that Rogers’s assumption is justified. 
Part (b): The exact value of $\lambda$ is $2T^3/(1 + T^2) = 0.37550128\ldots$, where $T = 
0.642527\ldots$ is the unique number in $(0, 1)$ such that 
\[ 4T^3/(1 - T^2)^2 = \tan(2\pi/5) = \sqrt{5 + 2\sqrt{5}}. \]
Alternatively, $T$ is the unique solution in $(1/2, 1)$ of the algebraic equation 


\[ 5T^{16} - 200T^{14} + 1036T^{12} - 1240T^{10} + 990T^{8} - 440T^{6} + 140T^{4} - 40T^{2} + 5 = 0. \]


The inradius $2T^3/(1 + T^2)$ occurs for the triangle $PO_1Q$ in the position shown in 
Figure 6642B, where $O_1O_2O_3O_4O_5$ is the given regular pentagon of side-length 1, $PQ$ 
is parallel to $O_3O_4$, and angle $PO_1Q = \pi - 4\arctan T$. The two sides $O_1P$ and $O_1Q$ 
of the triangle $PO_1Q$ have length $2T^2/(1 - T^2) = 1.4062346\ldots$ and the third side $PQ$ 
has length $4T^2/(1 + T^2) = 1.1688258\ldots$. It is not difficult (using the law of sines) to 
conclude further that the segments $O_2Q$ and $O_5P$ have length $0.72683402\ldots$. 
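
The numerical values quoted in Part (b) can be cross-checked with a few lines of Python. This is only an illustrative sketch, not part of the submitted solutions. It assumes the relations read $4T^3/(1 - T^2)^2 = \tan(2\pi/5)$, $\lambda = 2T^3/(1 + T^2)$, equal sides $2T^2/(1 - T^2)$ and base $4T^2/(1 + T^2)$. The bisection bracket $[0.5, 0.9]$ and all function names are choices made here. The inradius is recomputed from the side lengths by Heron's formula as an independent check.

```python
import math

TARGET = math.tan(2 * math.pi / 5)  # equals sqrt(5 + 2*sqrt(5))


def solve_T(lo=0.5, hi=0.9, iterations=200):
    """Bisection for the root of 4T^3/(1-T^2)^2 = tan(2*pi/5) in [lo, hi]."""
    def g(T):
        return 4 * T**3 / (1 - T**2) ** 2 - TARGET

    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid  # sign change lies in the left half
        else:
            lo = mid  # sign change lies in the right half
    return (lo + hi) / 2


def heron_inradius(a, b, c):
    """Inradius = area / semiperimeter, with the area from Heron's formula."""
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))
    return area / s


def degree16(T):
    """Left side of the degree-16 equation quoted in the editorial comment."""
    coeffs = {16: 5, 14: -200, 12: 1036, 10: -1240, 8: 990, 6: -440, 4: 140, 2: -40, 0: 5}
    return sum(c * T**k for k, c in coeffs.items())


T = solve_T()
inradius = 2 * T**3 / (1 + T**2)      # claimed exact value of the maximum inradius
equal_side = 2 * T**2 / (1 - T**2)    # lengths O1P = O1Q
base = 4 * T**2 / (1 + T**2)          # length PQ

print(T, inradius, equal_side, base)
```

Under these assumptions the closed-form inradius and the Heron-formula inradius of the isosceles triangle agree, which is a useful consistency test for the exponents in the formulas.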

Two solutions establishing the exact value of $\lambda$ were received, one from Lou Hong-Wei 
of Ning-Bo University, Zhejiang Province, People’s Republic of China, and the other from 
Li Wenzhi and Cheng Yiping of the University of Science and Technology, Hefei, People’s 
Republic of China. In addition A. Tissier of Montfermeil, France obtained the same result 
under the restrictive assumption that the triangle in question is isosceles and its axis of 
symmetry coincides with one of the five axes of symmetry of the pentagon. Because of the 
considerable length and complicated nature of these solutions, we do not include a solution 
of (b) here. 


A Special Sequence of Algebraic Integers 


E 3461 [1991, 755]. Proposed by David Callan, University of Wisconsin, Madison, WI. 
Suppose $r$ is a rational number but not an integer. It is known that $\tan(r\pi/2)$ is an algebraic 
number. (Cf. Ivan Niven, Irrational Numbers, Carus Mathematical Monographs No. 11, 
pp. 37-41.) Find the smallest positive integer $k_r$ such that $k_r \tan(r\pi/2)$ is an algebraic 
integer. 


Solution by Albert Nijenhuis, University of Pennsylvania (Emeritus), Philadelphia, PA, and 
University of Washington, Seattle, WA. If the denominator of $r$, in lowest terms, is a power 
of an odd prime $p$, then $k_r = p$; otherwise $k_r = 1$. 

The cyclotomic polynomial $\Phi_n(z)$ of order $n$ is the polynomial whose (simple) zeros 
are the primitive $n$th roots of unity $\{ e^{2\pi i k/n} : \gcd(n, k) = 1 \}$. Its degree is $\varphi(n)$, the Euler 
totient function, and it is irreducible over the rationals. The $n$th roots of unity are the zeros 
of $z^n - 1$, and 

\[ z^n - 1 = \prod_{d \mid n} \Phi_d(z). \tag{1} \]


Let $P_n(t)$ be the polynomial $(1 - it)^{\varphi(n)} \Phi_n((1 + it)/(1 - it))$, for $n > 2$. In view 
of the relation $e^{i\theta} = (1 + i\tan(\theta/2))/(1 - i\tan(\theta/2))$, the zeros of $P_n(t)$ are the numbers 
$\tan(k\pi/n)$ such that $\gcd(k, n) = 1$. Since any factorization of $P_n(t)$ would yield a 
factorization of $\Phi_n(z)$, $P_n(t)$ is irreducible over the rationals. 

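When $n > 1$, the coefficients of $z^j$ and $z^{\varphi(n)-j}$ in $\Phi_n(z)$ are equal. This follows from 
(1) by induction on $n$ and reflects the fact that if $\Phi_n(z_0) = 0$, then $\Phi_n(z_0^{-1}) = 0$. If $n > 2$, 
then $\varphi(n)$ is even, and we define $\{a_j\}$ by $\Phi_n(z) = \sum_{j=0}^{\varphi(n)/2} a_j [z^j + z^{\varphi(n)-j}]$. 
The constant term of $P_n(t)$ is $P_n(0) = \Phi_n(1)$, and since 

\[
\begin{aligned}
P_n(t) &= (1 - it)^{\varphi(n)} \sum_{j=0}^{\varphi(n)/2} a_j \left[ \left(\frac{1 + it}{1 - it}\right)^{j} + \left(\frac{1 + it}{1 - it}\right)^{\varphi(n)-j} \right] \\
&= \sum_{j=0}^{\varphi(n)/2} a_j (1 + it)^{\varphi(n)/2} (1 - it)^{\varphi(n)/2} \left[ \left(\frac{1 - it}{1 + it}\right)^{\varphi(n)/2-j} + \left(\frac{1 + it}{1 - it}\right)^{\varphi(n)/2-j} \right] \\
&= \sum_{j=0}^{\varphi(n)/2} a_j (1 + t^2)^{j} \left[ (1 - it)^{\varphi(n)-2j} + (1 + it)^{\varphi(n)-2j} \right],
\end{aligned}
\]

the leading coefficient of $P_n(t)$ is 

\[ \sum_{j=0}^{\varphi(n)/2} a_j \, 2(-1)^{\varphi(n)/2-j} = (-1)^{\varphi(n)/2} \Phi_n(-1). \]

If $n$ is a positive power of a prime $p$, then $\Phi_n(1) = p$. If $n$ has distinct prime divisors, 
then $\Phi_n(1) = 1$. This is well known and follows from (1) by an induction argument applied 
to 

\[ \prod_{\{d \,:\, d \mid n,\ d > 1\}} \Phi_d(1) = \lim_{z \to 1} \frac{z^n - 1}{z - 1} = n. \]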

Suppose $n > 2$. If $n$ is odd, then $\Phi_n(-1) = 1$. If $n$ is even, then 

\[ \prod_{\{d \,:\, d \mid n,\ d > 2\}} \Phi_d(-1) = \lim_{z \to -1} \frac{z^n - 1}{z^2 - 1} = n/2. \]

Therefore (again by induction), $\Phi_n(-1) = 2$ if $n$ is a power of 2, and $\Phi_n(-1) = p$ when 
$n$ is twice a positive power of an odd prime $p$. In all other cases, $\Phi_n(-1) = 1$. 
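
The stated values of the cyclotomic polynomials at $z = 1$ and $z = -1$ are easy to test by machine. The sketch below is an illustration added here, not part of Nijenhuis's solution. It builds $\Phi_n$ from relation (1) by exact integer polynomial division and compares $\Phi_n(1)$ and $\Phi_n(-1)$ with the values claimed in the text for $3 \le n \le 60$. All function names and the range of $n$ are assumptions of the sketch.

```python
from functools import lru_cache


def poly_divide(num, den):
    """Exact quotient of integer polynomials (coefficients from the constant
    term upward); den is assumed monic, as every cyclotomic polynomial is."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        q = num[i + len(den) - 1]
        quot[i] = q
        for j, d in enumerate(den):
            num[i + j] -= q * d
    assert not any(num), "division was not exact"
    return quot


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n, obtained from z^n - 1 = product of Phi_d over the divisors d of n."""
    poly = [-1] + [0] * (n - 1) + [1]  # z^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divide(poly, cyclotomic(d))
    return tuple(poly)


def evaluate(poly, z):
    return sum(c * z**k for k, c in enumerate(poly))


def prime_power_base(n):
    """Return p if n = p**a with p prime and a >= 1, otherwise None."""
    if n < 2:
        return None
    p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
    while n % p == 0:
        n //= p
    return p if n == 1 else None


def expected_at_one(n):
    """Claimed value of Phi_n(1) for n >= 2."""
    return prime_power_base(n) or 1


def expected_at_minus_one(n):
    """Claimed value of Phi_n(-1) for n > 2."""
    if n % 2 == 1:
        return 1
    if prime_power_base(n) == 2:
        return 2
    return prime_power_base(n // 2) or 1  # p when n = 2 * p**a, else 1


mismatches = [
    n for n in range(3, 61)
    if evaluate(cyclotomic(n), 1) != expected_at_one(n)
    or evaluate(cyclotomic(n), -1) != expected_at_minus_one(n)
]
print(mismatches)
```

An empty list of mismatches means that the claimed values hold throughout the tested range.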

When $n$ is not a power of 2, either $\Phi_n(1) = 1$ or $\Phi_n(-1) = 1$, making $P_n(t)$ primitive 
(the greatest common divisor of its coefficients is 1). Otherwise, $\Phi_n(z) = z^{n/2} + 1$, and it 
is easy to see that $\tfrac{1}{2} P_n(t)$ is primitive. It follows that the leading coefficient of the primitive 
polynomial $P_n(t)$ (or $\tfrac{1}{2} P_n(t)$) equals $\pm 1$ when $n$ is odd, a power of 2, or twice a number with 
distinct prime divisors. In these cases, we set $K_n = 1$. When $n$ is twice a positive power of 
an odd prime $p$, we set $K_n = p$. As a result, if $\gcd(k, n) = 1$, then $K_n \tan(k\pi/n)$ is a zero 
of a polynomial with leading coefficient $\pm 1$, namely the polynomial $K_n^{\varphi(n)-1} P_n(t/K_n)$ or 
half of this, and $K_n$ is the minimal such integer. 

Finally, if $n$ is the denominator of the rational number $r/2$ when written in lowest terms, 
then we set $k_r = K_n$. 


Editorial comment. Another method of proof, used by Robin Chapman and a referee, replaced 
the detailed study of the cyclotomic polynomials by corresponding properties of the 
algebraic numbers $1 - \zeta_n$, where $\zeta_n$ is a primitive $n$th root of unity. While the selected 
solution is more elementary, the discovery of the result and the organization of the proof 
can be simplified by the use of fairly well-known algebraic number theory. 


Solved also by R. J. Chapman (U. K.), P. Cizek (Czech Republic), I. Kastanas, O. P. Lossers (The Netherlands), and 
the proposer. 


Singular Values of a Classical Matrix 


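10312 [1993, 499]. Proposed by Hongyuan Zha, IMA—University of Minnesota, Minneapolis, 
MN. Let $c$ and $s$ be non-negative real numbers satisfying $c^2 + s^2 = 1$. Prove that, for 
$n > 1$, $s^{n-2}\sqrt{1 + c}$ is the second smallest singular value of the $n$ by $n$ upper triangular 
matrix 

\[
T_n(c) = \operatorname{diag}(1, s, \dots, s^{n-1})
\begin{pmatrix}
1 & -c & -c & \cdots & -c \\
  & 1 & -c & \cdots & -c \\
  &   & \ddots & \ddots & \vdots \\
  &   &   & 1 & -c \\
  &   &   &   & 1
\end{pmatrix}.
\]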


Solution I by the proposer. Let $\sigma = s^{n-2}\sqrt{1 + c}$. If $s = 0$ or $s = 1$, the result is obvious, 
so we assume $0 < s < 1$. Let $v^T = (1/\sqrt{2})\,(0, \dots, 0, 1, -1)$, and let $u^T = 
(1/\sqrt{2(1 + c)})\,(0, \dots, 0, 1 + c, -s)$; these are $n$-dimensional vectors of length 1. Since 
$T_n(c)v = \sigma u$ and $T_n(c)^T u = \sigma v$, it follows that $\sigma$ is a singular value of $T_n(c)$. With 
$\sigma_1 \ge \cdots \ge \sigma_n$ being the singular values of $T_n(c)$, the inequality $\sigma_n(T_n(c)) \le s^{n-1}$ 
implies that $\sigma \ne \sigma_n$. Thus, $\sigma_{n-1}(T_n(c)) \le \sigma$ with equality when $n = 2$. This is the basis 
for a proof by induction on $n$ that $\sigma_{n-1}(T_n(c)) = \sigma$. Assume that $n > 2$ and the result 
is true for $n - 1$. Using the interlacing property of the singular values and the induction 
hypothesis, we have 


\[ \sigma_{n-2}(T_n(c)) \ge \sigma_{n-2}(T_{n-1}(c)) = s^{n-3}\sqrt{1 + c} > \sigma. \]

Therefore, only $\sigma_{n-1}(T_n(c))$ can be equal to $\sigma$. 


Solution II by Leslie Foster, San Jose State University, San Jose, CA. Let $R = T_n(c)$. The 
squares of the singular values $\sigma_1 \ge \cdots \ge \sigma_n$ of $R$ are the eigenvalues of $R^T R$. As in Solution 
I, we may assume that $0 < s < 1$. The example given there shows that $\sigma = s^{n-2}\sqrt{1 + c}$ is 
a singular value of $R$ and $\sigma \ne \sigma_n$. 

For $1 \le k \le n - 1$, let $B_k$ be the $n$ by $k$ matrix that has zero entries except for 1’s on the 
main diagonal and $-1$’s on the first subdiagonal. Also let $t_k = s^{2(k-1)}(1 + c)$, and let $\mathcal{S}_k$ be 
the set of $k$-dimensional subspaces of Euclidean $n$-space $\mathbb{R}^n$. We will show that $\sigma_k \ge \sqrt{t_k}$, 
so that $\sigma_k > \sigma$ for $k < n - 1$. 

By the maxi-min characterization of singular values, 

\[ \sigma_k = \max_{S \in \mathcal{S}_k} \min_{y \in S} \frac{\|Ry\|}{\|y\|} \ge \min_{x \in \mathbb{R}^k} \frac{\|R B_k x\|}{\|B_k x\|}. \]

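The square of the solution to this last minimization problem is the smallest eigenvalue of 
the generalized eigenvalue problem $Ax = \lambda Mx$, where $A = B_k^T R^T R B_k$ and $M = B_k^T B_k$. 
If $x$ and $\lambda$ satisfy $Ax = \lambda Mx$, then it follows that $x^T(A - t_k M)x = (\lambda - t_k)\,x^T M x$. 
Therefore, if $A - t_k M$ is symmetric semidefinite and $M$ is symmetric positive definite, 
we may conclude that all the eigenvalues of $Ax = \lambda Mx$ are at least as large as $t_k$. Since 
$B_k$ has full column rank, it follows that $M = B_k^T B_k$ is symmetric positive definite. By a 
straightforward calculation, $A - t_k M$ equals 

\[
(1 + c)
\begin{pmatrix}
2 & -s^2 & 0 & \cdots & 0 \\
-s^2 & 2s^2 & -s^4 & \ddots & \vdots \\
0 & -s^4 & 2s^4 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & -s^{2(k-1)} \\
0 & \cdots & 0 & -s^{2(k-1)} & 2s^{2(k-1)}
\end{pmatrix}
- (1 + c)\,s^{2(k-1)}
\begin{pmatrix}
2 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \ddots & \vdots \\
0 & -1 & 2 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & -1 \\
0 & \cdots & 0 & -1 & 2
\end{pmatrix}.
\]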


From this formula, it follows that $A - t_k M$ is diagonal semidominant, is symmetric, and 
has nonnegative diagonal entries. It follows by known results that $A - t_k M$ is symmetric 
semidefinite. Thus $\sigma_k^2 \ge t_k$. 

The program Matlab was helpful in discovering this proof. Numerical experiments 
helped to identify $\sigma^2$ as an eigenvalue of $R^T R$ and to find the form of the matrix $A$ used in 
the proof. 


Editorial comment. The proposer notes that $T_n(c)$ is a well-known example in numerical 
linear algebra. The QR decomposition with column pivoting of $T_n(c)$ is itself. For $c = 0.2$ 
and $n = 100$, the $(n, n)$ element of $T_n(c)$ is about .13, while its smallest singular value 
is about $10^{-8}$. Therefore, QR decomposition with column pivoting does not reveal the 
near rank deficiency of this matrix. This is given as Example 5.5.1 on p. 245 of G. Golub 
and C. Van Loan, Matrix Computations, 2nd edition, The Johns Hopkins University Press, 
1989. Section 8.3.1 of the same book is one reference for the well-known properties of the 
singular values used in the selected solutions. 
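
The singular pair used in Solution I can be verified directly in floating point. The following dependency-free Python sketch is an illustration written for this purpose; the function names are invented here. It builds $T_n(c) = \operatorname{diag}(1, s, \dots, s^{n-1})\,U$, with $U$ unit upper triangular and $-c$ above the diagonal, forms $v = (0, \dots, 0, 1, -1)/\sqrt{2}$ and $u = (0, \dots, 0, 1 + c, -s)/\sqrt{2(1 + c)}$, and measures how far $T_n(c)v$ and $T_n(c)^T u$ are from $\sigma u$ and $\sigma v$, where $\sigma = s^{n-2}\sqrt{1 + c}$.

```python
import math


def kahan(n, c):
    """T_n(c): row i (0-based) is s**i times (0, ..., 0, 1, -c, ..., -c)."""
    s = math.sqrt(1 - c * c)
    return [[s**i * (1 if j == i else (-c if j > i else 0)) for j in range(n)]
            for i in range(n)]


def matvec(matrix, x):
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def singular_pair(n, c):
    """The value sigma and the unit vectors u, v described in Solution I."""
    s = math.sqrt(1 - c * c)
    sigma = s ** (n - 2) * math.sqrt(1 + c)
    v = [0.0] * (n - 2) + [1 / math.sqrt(2), -1 / math.sqrt(2)]
    scale = 1 / math.sqrt(2 * (1 + c))
    u = [0.0] * (n - 2) + [scale * (1 + c), -scale * s]
    return sigma, u, v


def residual(n, c):
    """Largest entry of T v - sigma u and of T^T u - sigma v, in absolute value."""
    T = kahan(n, c)
    sigma, u, v = singular_pair(n, c)
    r1 = max(abs(a - sigma * b) for a, b in zip(matvec(T, v), u))
    r2 = max(abs(a - sigma * b) for a, b in zip(matvec(transpose(T), u), v))
    return max(r1, r2)


print(residual(6, 0.3), residual(100, 0.2))
print(kahan(100, 0.2)[99][99])  # the (n, n) element discussed in the editorial comment
```

Both residuals should be at rounding level. Computing the full set of singular values, for example to observe the tiny smallest one for $n = 100$, would need a linear algebra library and is left out of this sketch.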


Zeros of a Geometric Series with Random Signs 


10351 [1993, 952]. Proposed by Leopold Flatto and Jeffrey C. Lagarias, AT&T Bell Laboratories, 
Murray Hill, NJ. Consider the random power series $f(t) = \sum_{n=0}^{\infty} \eta_n t^n$, where 
the $\eta_i$ are drawn independently from $\{-1, 1\}$, with the probability of $\eta_i = 1$ being $p$ for 
all $i$. (a) If $p = 1/2$, show that $f(t)$ has infinitely many zeros in the interval $(0, 1)$ with 
probability one. (b) What happens if $p \ne 1/2$? 


Solution of (a) by Richard Holzsager, The American University, Washington, DC. We show 
that, with probability one, $f(t)$ oscillates between arbitrarily large positive and negative 
values as $t$ approaches 1. Since $f$ is bounded on any interval $[0, t]$ with $t < 1$, it is enough to show 
that $f$ attains arbitrarily high positive and negative values, and by symmetry, it suffices to 
demonstrate the positive case. 

The proof depends on Kolmogorov’s zero-one law. This deals with a situation where 
there is a sequence $X_1, X_2, \dots$ of independent random variables, such as the coefficients 
$\eta_n$, and says that any “tail event” has probability 0 or 1. A “tail event” is one that, for any 
$n$, depends only on the values of $X_n, X_{n+1}, \dots$. 

First note that, for any fixed n, changing the first n signs can change f(t) by at most n. 
Thus, the given property is clearly a tail event. 

We now show that this event does not have probability zero. Suppose $M$ is any large 
integer. Then by the “gambler’s ruin” result in the theory of random walks (see W. Feller, 
An Introduction to Probability Theory and its Applications, Vol. I (third edition), Wiley, 
1968, p. 347, eq. 2.8), $\sum_{i=0}^{n} \eta_i > M$ for some $n$ with probability one. Since the polynomial 
$f_n(t) = \sum_{i=0}^{n} \eta_i t^i$ is continuous, it has (with probability one) a value greater than $M$ for 
$t$ close to 1. Choose such a $t$; by symmetry, the tail of the geometric series is nonnegative 
with probability at least 1/2, so $\Pr(f(t) \ge f_n(t))$ is at least 1/2. Thus, for $M = 1, 2, \dots$ 
we have a decreasing sequence of events $\{ f(t)$ is greater than $M$ for some $t \}$, each with 
probability at least 1/2. The probability of their intersection is the limit of their probabilities, 
so it too is at least 1/2. 


Solution of (b) by Jaime Lobo, Universidad de Costa Rica, San José, Costa Rica. If $p \ne 1/2$, 
then $f$ has only finitely many zeros in $(0, 1)$ with probability one. Indeed, $f$ is analytic in 
$D = \{z : |z| < 1\}$ and $f(0) \ne 0$, so $f$ has only finitely many zeros in every closed interval 
$[0, t]$ with $0 < t < 1$. It thus suffices to show that, almost surely, $f(t)$ is of constant sign 
for $t$ near 1. 

Consider the function $F(z) = (1 - z) f(z)$. It too is analytic in $D$. Its associated power 
series is $\eta_0 + \sum_{n \ge 1} (\eta_n - \eta_{n-1}) z^n$. Then the $m$th partial sum of this series, evaluated at 
$z = 1$, is $\eta_m$. Also, it follows from the strong law of large numbers that 

\[ \lim_{m \to \infty} \frac{\eta_0 + \eta_1 + \cdots + \eta_m}{m + 1} = 2p - 1 \]

almost surely. In this case the Cesàro sum of the series $\eta_0 + \sum_{n \ge 1} (\eta_n - \eta_{n-1})$ is also $2p - 1$ 
and so, from an extension of Abel’s theorem, $\lim_{t \to 1} F(t) = 2p - 1$. The sign of $F(t)$, and 
hence that of $f(t)$, agrees with that of $2p - 1$, almost surely, for $t$ close to 1. Indeed, $f(t)$ 
approaches either $+\infty$ or $-\infty$. 


Editorial comment. Frank Schmidt supplied a reference to Anton Bovier and Pierre Picco, A 
law of the iterated logarithm for random geometric series, Annals of Probability, 21 (1993), 
168-184, in which the asymptotic behavior of $f(t)$ as $t \to 1$ in case (a) is studied in fine 
detail. 


Solved also by J. H. Lindsey II, F. Schmidt, and the proposers. 


Still No Solutions 


10353 [1993, 952]. Proposed by Barry Powell, Kirkland, WA. Show that, for any odd prime 
$p$, there do not exist nonzero integers $x$, $y$, $z$ satisfying 


\[ (x, y) = 1, \qquad p \nmid xy, \qquad x^6 + y^6 = z^p. \]


Composite solution by Robin J. Chapman, University of Exeter, Exeter, UK, and the proposer. 
If $p = 3$, the result follows from Fermat’s Last Theorem for exponent 3, so assume $p \ge 5$. 
As suggested in the note accompanying the statement of the problem, we use 

Lemma 1. Suppose $a$ and $b$ are coprime odd positive integers with $a \equiv b \pmod 4$. If $m$ 
is an odd positive integer and $Q_m = Q_m(a, b) = (a^m - b^m)/(a - b)$, then $Q_m$ is odd. If $n$ 
is another odd positive integer, then the Jacobi symbols $(Q_m/Q_n)$ and $(m/n)$ are equal. 


Proof. See Lemma 6.1 of Chapter IV of Paolo Ribenboim, 13 Lectures on Fermat’s Last 
Theorem, Springer-Verlag, 1979. 

Factoring the target equation yields $z^p = (x^2 + y^2)(x^4 - x^2 y^2 + y^4)$. It follows that the 
greatest common divisor of the latter two factors divides both $3x^4$ and $3y^4$. Since $(x, y) = 1$, 
the greatest common divisor divides 3. But 3 cannot divide a sum of two relatively prime 
squares, so the factors are relatively prime. Thus $x^2 + y^2 = v^p$ and $x^4 - x^2 y^2 + y^4 = w^p$ 
for relatively prime integers $v$ and $w$. Also, $x$ and $y$ cannot both be odd; otherwise $x^6 + y^6$ 
would be congruent to 2 mod 4 and could not be a $p$th power. Thus $v \equiv w \equiv 1 \pmod 4$. 

We now see that $3x^2 y^2 = v^{2p} - w^p = (v^2 - w)\,Q_p(v^2, w)$. The two factors on the 
right are relatively prime. To see this, note that $Q_p(v^2, w) \equiv p\,w^{p-1} \pmod{v^2 - w}$. Then, 
$\gcd(w, v^2 - w) = 1$ follows from $\gcd(v, w) = 1$; and $\gcd(p, v^2 - w) = 1$ follows from 
$\gcd(p, xy) = 1$. 

Also, since $v$ and $w$ are relatively prime, $v^{2p} - w^p$ is divisible by 3 only if $v^2 \equiv w \equiv 1 
\pmod 3$. Thus, $Q_p(v^2, w)$ is a square and $v^2 - w$ is three times a square. Finally, since $p$ 
is prime, there exists a prime $q$ for which $(p/q) = -1$. The lemma then implies 


\[ -1 = (p/q) = \bigl(Q_p(v^2, w) / Q_q(v^2, w)\bigr), \]


which is impossible since $Q_p(v^2, w)$ is a square. 


On the Number of Ties between Players of Equal Strength 


10355 [1994, 75]. Proposed by Joaquin Gomez Rey, I. B. “Luis Buñuel”, Alcorcón (Madrid), 
Spain. Two players of equal strength play a tournament consisting of $2n$ matches. Let $T$ be 
the random variable that counts the number of times the score is tied during the tournament 
(including the initial 0-0). What is $E(T) + E(T^2)$? 


Solution by Dennis P. Walsh, Middle Tennessee State University, Murfreesboro, TN. The 
answer is $2(n + 1)$. For $k = 0, 1, \dots, n$, let $T_{2k}$ be the random variable with value 1 if 
the score is tied after $2k$ games and value 0 otherwise. The score is tied only after an even 
number of games, so $T = \sum_{k=0}^{n} T_{2k}$, and 


\[ E(T) = \sum_{k=0}^{n} E(T_{2k}) = \sum_{k=0}^{n} \binom{2k}{k} \left(\frac{1}{4}\right)^{k}. \]


For $k \ge j$, we have 


\[ E(T_{2k} T_{2j}) = \Pr(T_{2k} = 1 \mid T_{2j} = 1)\,\Pr(T_{2j} = 1) = \binom{2k - 2j}{k - j} \binom{2j}{j} \left(\frac{1}{4}\right)^{k}. \]


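Since $T_{2k}^2 = T_{2k}$, this yields 


\[ E(T) + E(T^2) = 2 \sum_{k=0}^{n} \sum_{j \le k} E(T_{2k} T_{2j}) = 2 \sum_{k=0}^{n} \sum_{j \le k} \binom{2k - 2j}{k - j} \binom{2j}{j} \left(\frac{1}{4}\right)^{k}. \]


Using the well-known sum $\sum_{j=0}^{k} \binom{2k - 2j}{k - j} \binom{2j}{j} = 4^k$, we obtain 


\[ E(T) + E(T^2) = 2 \sum_{k=0}^{n} \left(\frac{1}{4}\right)^{k} \sum_{j=0}^{k} \binom{2k - 2j}{k - j} \binom{2j}{j} = 2(n + 1). \]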
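
For small $n$ the answer $2(n + 1)$ can be confirmed by exhaustive enumeration. The sketch below is an illustration added here, with function names chosen for the example. It assumes each match is won by either player with probability 1/2 and that no match is drawn, lists all $2^{2n}$ outcome sequences, counts the ties in each (including the initial 0-0), and computes $E(T)$ and $E(T^2)$ exactly with rational arithmetic. It also compares $E(T)$ with the sum $\sum_{k=0}^{n} \binom{2k}{k} 4^{-k}$ from the solution.

```python
from fractions import Fraction
from itertools import product
from math import comb


def tie_moments(n):
    """Exact E(T) and E(T^2) for a tournament of 2n matches, by enumeration."""
    total_t = total_t2 = 0
    for outcome in product((1, -1), repeat=2 * n):
        lead, ties = 0, 1  # the initial 0-0 counts as a tie
        for step in outcome:
            lead += step
            if lead == 0:
                ties += 1
        total_t += ties
        total_t2 += ties * ties
    count = 4**n  # number of equally likely outcome sequences
    return Fraction(total_t, count), Fraction(total_t2, count)


def expected_ties(n):
    """E(T) from the formula in the solution: sum of C(2k, k) / 4^k."""
    return sum(Fraction(comb(2 * k, k), 4**k) for k in range(n + 1))


for n in range(6):
    e_t, e_t2 = tie_moments(n)
    print(n, e_t, e_t2, e_t + e_t2)
```

The enumeration grows like $4^n$, so it is only practical for small $n$; it is a sanity check rather than a proof.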


Editorial comment. Numerous solvers cited W. Feller, An Introduction to Probability Theory 
and its Applications, Vol. I (third edition), Wiley, 1968, pp. 96 and 110, where the random 
variable T (actually T — 1, the number of returns to zero of the symmetric random walk) 
is carefully studied. The expectation and variance of this random variable are computed in 
J. Riordan, Combinatorial Identities, Wiley, 1968, pp. 31-32, under the guise of the so-called 
Banach matchbox problem. José Luis Palacios supplied a reference to P. Kirschenhofer and 
H. Prodinger, “The higher moments of the number of returns of a simple random walk”, 
Adv. Appl. Prob. 26 (1994), 561-563, where a generalization of this problem is studied. 


Solved also by R. A. Agnew, M. H. Andreoli, D. Beckwith, P. Budney, R. J. Chapman (U. K.), D. A. Darling, R. P. 
Dobrow, R. Ehrenborg (Canada), R. A. Groeneveld, V. Hernandez (Spain), R. Holzsager, R. D. Hurwitz, O. Krafft 
(Germany), J. H. Lindsey II, O. P. Lossers (The Netherlands), G. Loudner, L. E. Mattics, J. L. Palacios (Venezuela), 
D. E. Rauschenberg & Jian-Min Li, F. Schmidt, N. C. Singer, H. L. Stubbs, M. Vowe (Switzerland), H. Weingarten, 
E. A. Weinstein, A. N. ’t Woord (The Netherlands), D. Zeilberger, NSA Problems Group, and the proposer. 


A Sequence of Squares 


10356 [1994, 75]. Proposed by Shalosh B. Ekhad, Princeton, NJ. Let $X_n$ be defined by 
$X_0 = 0$, $X_1 = 1$, $X_2 = 0$, $X_3 = 1$, and for $n \ge 1$, 


\[ X_{n+3} = \frac{(n^2 + n + 1)(n + 1)}{n} X_{n+2} + (n^2 + n + 1) X_{n+1} - \frac{n + 1}{n} X_n. \]


Prove that $X_n$ is the square of an integer for every $n \ge 0$. 


Solution by Donald A. Darling, Newport Beach, CA. Define a sequence $\{c_n\}$ by setting 
$c_0 = 0$, $c_1 = 1$, and $c_{n+2} = n c_{n+1} + c_n$ for $n \ge 0$. Then $c_{n+3} = (n + 1)c_{n+2} + c_{n+1}$, and 
$c_n = c_{n+2} - n c_{n+1}$. Squaring these two equations yields 


\[ c_{n+3}^2 = (n + 1)^2 c_{n+2}^2 + c_{n+1}^2 + 2(n + 1) c_{n+2} c_{n+1}, \]

\[ c_n^2 = c_{n+2}^2 + n^2 c_{n+1}^2 - 2n c_{n+2} c_{n+1}. \]


Eliminating $c_{n+2} c_{n+1}$ yields 


\[ c_{n+3}^2 = \frac{(n^2 + n + 1)(n + 1)}{n} c_{n+2}^2 + (n^2 + n + 1) c_{n+1}^2 - \frac{n + 1}{n} c_n^2. \]


Since $c_2 = 0$ and $c_3 = 1$, the sequence $c_n^2$ satisfies the same recurrence as $X_n$ with the same 
initial values, so $X_n = c_n^2$ for all $n \ge 0$. 
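
The identity $X_n = c_n^2$ is easy to test numerically. The following sketch is an illustration, not part of Darling's solution, and its function names are chosen here. It iterates the third order recurrence for $X_n$ in exact rational arithmetic (the division by $n$ makes integer arithmetic unsafe) and compares the result with the squares of the sequence defined by $c_0 = 0$, $c_1 = 1$, $c_{n+2} = n c_{n+1} + c_n$.

```python
from fractions import Fraction


def x_sequence(count):
    """First `count` terms of X_n from the recurrence in the problem (count >= 4)."""
    x = [Fraction(0), Fraction(1), Fraction(0), Fraction(1)]
    n = 1
    while len(x) < count:
        # Here len(x) == n + 3, so the next term appended is X_{n+3}.
        nxt = (Fraction((n * n + n + 1) * (n + 1), n) * x[n + 2]
               + (n * n + n + 1) * x[n + 1]
               - Fraction(n + 1, n) * x[n])
        x.append(nxt)
        n += 1
    return x[:count]


def c_sequence(count):
    """First `count` terms of c_n with c_{n+2} = n*c_{n+1} + c_n (count >= 2)."""
    c = [0, 1]
    while len(c) < count:
        n = len(c) - 2
        c.append(n * c[-1] + c[-2])
    return c[:count]


xs = x_sequence(30)
cs = c_sequence(30)
print(cs[:8])
print(all(x == c * c for x, c in zip(xs, cs)))
```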


Editorial comment. E. Sparre Andersen and Mogens Esrom Larsen noted that the sequence 
$c_n$ is sequence number 704 in N. J. A. Sloane’s Handbook of Integer Sequences. The solution 
of Murray S. Klamkin also begins by using this reference to identify the sequence $c_n$ and its 
recurrence from $c_0, c_1, \dots, c_6$. Istvan Nemes noted that tools for using initial values of the 
sequence $c_n$ to determine a recurrence, and then finding the recurrence satisfied by the $c_n^2$, 
are available in the gfun package in the Analysis part of the Maple share library. However, 
direct application of these tools shows only that $c_n^2$ satisfies a fourth order recurrence instead 
of the given third order recurrence as found in the selected solution. To complete the proof, 
a recurrence is derived for the difference between this sequence and the given one. It turns 
out to be the fourth order recurrence constructed to give the $c_n^2$. Since the first four terms 
are seen to be zero, the whole sequence is zero. 


Solved also by J. Alvarez (Spain), E. S. Andersen & M. E. Larsen (Denmark), J. Anglesio (France), B. D. Beasley, 
D. Beckwith, K. L. Bernstein, R. E. Bernstein, J. C. Binz (Switzerland), S. Byrd, D. Callan, R. J. Chapman (U. K.), 
O. Chen, J. Christopher, E. Cohen (France), J. M. Cohen, C. K. Cook, P. Cull, P. Deiermann, R. L. Doucette, J. S. 
Frame, Z. Franco, R. A. Groeneveld, R. Heller, R. Holzsager, M. S. Klamkin (Canada), D. R. Lepro, N. F. Lindquist, 
O. P. Lossers (The Netherlands), G. Loudner, J. B. Muskat & J. Schiff (Israel), I. Nemes (Austria), A. Nijenhuis, 
R. Richberg (Germany), R. M. Robinson, F. Schmidt & C. Forbin, S. Shaffer, M. Shemesh (Israel), N. C. Singer, 
M. Vowe (Switzerland), A. N. ’t Woord (The Netherlands), A. Yandl, D. Zeitlin, NSA Problems Group, University of 
South Alabama Problem Group, and the proposer. 


Another Way to be Catalan 


10357 [1994, 75]. Proposed by Ira Gessel, Brandeis University, Waltham, MA. Define 
integers $a_{m,n}$ by 

\[ \frac{1}{1 - u - v + 2uv} = \sum_{m,n=0}^{\infty} a_{m,n} u^m v^n. \]

Show that $(-1)^j a_{2j,2j+2}$ is the Catalan number $\binom{2j}{j}/(j + 1)$. 


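Solution I by Chu Wenchang, Academia Sinica, Beijing, China. Consider the formal power 
series expansion 


\[
\begin{aligned}
\frac{1}{1 - u - v + 2uv} &= \frac{1}{(1 - u)(1 - v)} \cdot \frac{1}{1 + \frac{uv}{(1 - u)(1 - v)}} = \sum_{k \ge 0} \frac{(-1)^k u^k v^k}{(1 - u)^{k+1} (1 - v)^{k+1}} \\
&= \sum_{i,j,k \ge 0} (-1)^k \binom{i + k}{k} \binom{j + k}{k} u^{i+k} v^{j+k}.
\end{aligned}
\]


We conclude that $a_{m,n} = \sum_{k \ge 0} (-1)^k \binom{m}{k} \binom{n}{k}$. This convolution is the coefficient of $x^m$ 
in $(1 + x)^m (1 - x)^n$. With $m = 2j$ and $n = 2j + 2$, this generating function becomes 
$(1 - x^2)^{2j} (1 - x)^2$, in which the coefficient of $x^{2j}$ is 


\[ a_{2j,2j+2} = (-1)^j \binom{2j}{j} + (-1)^{j-1} \binom{2j}{j - 1} = (-1)^j \binom{2j}{j} \frac{1}{j + 1}. \]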

Solution II by Bruce Sagan, Michigan State University, East Lansing, MI. Let A-paths be 
lattice paths in which each step adds $(1, 0)$ or $(0, 1)$ or $(1, 1)$ to the current position; call 
these steps $H$, $V$, $D$, respectively. To each A-path with $d$ diagonal steps, assign the weight 
$(-2)^d$. By considering the possibilities for each successive step, the coefficient of $u^i v^j$ in 
the generating function $\sum_{i,j \ge 0} a_{i,j} u^i v^j = (1 - u - v + 2uv)^{-1} = \sum_{n \ge 0} (u + v - 2uv)^n$ 
is the sum of the weights of the A-paths from $(0, 0)$ to $(i, j)$. 

Let B-paths be A-paths that have the same number of $H$-steps and $V$-steps and never 
rise above the line $y = x$. Let $b_i$ be the sum of the weights of the B-paths from $(0, 0)$ 
to $(i, i)$, and let $B(z) = \sum_{i \ge 0} b_i z^i$. A B-path $P$ of positive length is $P = (D, \text{B-path})$ 
or $P = (H, \text{B-path}, V, \text{B-path})$. Since the formal variable in $B(z)$ records movement 
in the horizontal coordinate and every $D$ contributes a factor of $-2$, we obtain $B(z) = 
1 + (-2)zB(z) + zB(z)^2$. From the quadratic formula, $B(z) = \bigl(1 + 2z - \sqrt{1 + 4z^2}\bigr)/(2z)$. 

Let $F_k(z) = \sum_{i \ge 0} a_{i,i+k} z^i$, and let A′-paths be A-paths with the same number of $H$-steps 
as $V$-steps. An A′-path of positive length consists of $(D, \text{A′-path})$ or $(H, \text{B-path}, 
V, \text{A′-path})$ or the transpose of the latter. Hence $F_0(z) = 1 + (-2)zF_0(z) + 2zB(z)F_0(z)$, and 
$F_0(z) = 1/(1 + 2z - 2zB(z)) = 1/\sqrt{1 + 4z^2}$. 

An A-path from $(0, 0)$ to $(i, i + 2)$ must have the form (B-path, $V$, B-path, $V$, A′-path), 
where the two $V$’s represent the first time the path touches $y = x + 1$ and $y = x + 2$. This 
yields $F_2(z) = B(z)^2 F_0(z)$. 

The desired value $a_{2j,2j+2}$ is the coefficient of $z^{2j}$ in $G(z) = (F_2(z) + F_2(-z))/2 = 
\bigl(-1 + \sqrt{1 + 4z^2}\bigr)/(2z^2)$. The well-known generating function for the Catalan numbers is 
$C(w) = \sum_{i \ge 0} C_i w^i = \bigl(1 - \sqrt{1 - 4w}\bigr)/(2w)$. Since $C(-z^2) = G(z)$, we have $a_{2j,2j+2} = 
(-1)^j C_j$. 


Editorial comment. Several solvers used more sophisticated tools. Frank Schmidt used 
MacMahon’s Master Theorem, Shalosh B. Ekhad used the method of WZ pairs, and Rolf 
Richberg used hypergeometric notation and Jacobi polynomials. 


Solved also by J. Anglesio (France), K. L. Bernstein, G. A. Bookhout, A. E. Caicedo Núñez (Colombia), R. J. Chapman 
(U. K.), D. A. Darling, S. B. Ekhad, A. Firasath & C. C. Rousseau, R. Holzsager, D. R. Lepro, J. H. Lindsey II, O. P. 
Lossers (The Netherlands), G. Loudner, G. Miller (Canada), I. Nemes (Austria), R. C. Read (Canada), R. Richberg 
(Germany), F. Schmidt, M. Vowe (Switzerland), J. Wimp, A. N. ’t Woord (The Netherlands), A. Zeleke, Howard 
University Combinatorics group, and the proposer. 


Introducing the Eigenvalue 1 


10362 [1994, 175]. Proposed by Hans Liebeck and Anthony Osborne, University of Keele, 
Keele, England. Let A be a real orthogonal matrix without eigenvalue 1. Let B be obtained 
from A by replacing one of its rows or one of its columns by its negative. Show that B has 
1 as an eigenvalue. 


Solution by Richard Holzsager, The American University, Washington, DC. If 1 is not an 
eigenvalue of a real orthogonal n by n matrix, then all real eigenvalues equal —1. The 
remaining eigenvalues occur in conjugate pairs with product 1. Therefore, if 1 is not an 
eigenvalue, the determinant is (—1)”. Since det A = —det B, it is impossible for both 
matrices to have this property. 


Editorial comment. Both Murray S. Klamkin and Godfrey Loudner cited the following 
theorem from which the solution follows immediately: If an orthogonal n-by-n matrix A 
has determinant 1 when n is odd or —1 when n is even, then 1 is an eigenvalue of A. (See 
L. Mirsky, An Introduction to Linear Algebra, Oxford University Press, 1972, p. 226). 


F. Schmidt and Tad White each noted that this also follows from the fact that a continuous 
map from a sphere to itself without fixed points is homotopic to the antipodal map. 


Since B is the composition of A with a reflection, this construction provides the inductive 
step in the proof that an n-by-n real orthogonal matrix is a product of at most n reflections. 
Such a result holds for more general orthogonal groups consisting of those linear transfor- 
mations of an n-dimensional space over an arbitrary field of characteristic different from 
2 preserving a non-degenerate quadratic form, a result known as the Cartan-Dieudonné 
theorem (see E. Artin, Geometric Algebra, Interscience, 1957, Theorem 3.20, p. 129). In 
addition to requiring calculations over more general fields, which appeared in some solutions 
of the problem 10362, the proof of the Cartan-Dieudonné theorem requires consideration 
of the possibility of isotropic vectors (vectors orthogonal to themselves). However, the 
result needed here is easily recovered from the statement of this more general theorem. 
This approach was mentioned in three solutions (by H. Guggenheimer, Allan Pedersen, and 
Daniel B. Shapiro). 


Solved by 44 readers (including those cited) and the proposer. 


More Binomial Coefficients 


10363 [1994, 175]. Proposed by Joseph M. Santmyer, California University of Pennsylvania, 
California, PA. If $m$, $n$ are integers satisfying $1 \le m \le n - 1$, prove that 


\[ \binom{2n - m - 1}{m} - \binom{n - 1}{m} = \sum_{k, j \ge 0} \binom{k + j}{k} \binom{2n - m - 2k - j - 3}{2(n - m - k - 1)}. \]

Solution by Michael Vowe, Therwil, Switzerland. We can evaluate the sum on $j$ by equating 
coefficients of $x^{m-1}$ in 


\[ (1 - x)^{-(k+1)} (1 - x)^{-(2n-2m-2k-1)} = (1 - x)^{-(2n-2m-k)} \]


using the binomial theorem expansion of $(1 - x)^{-a}$. We obtain for the right side of the 
proposed identity 


\[ \sum_{k=0}^{n-m-1} \binom{2n - m - k - 2}{m - 1} = \sum_{k=0}^{n-m-1} \left[ \binom{2n - m - k - 1}{m} - \binom{2n - m - k - 2}{m} \right] = \binom{2n - m - 1}{m} - \binom{n - 1}{m}, \]

which equals the left side. 


Solved also by E. S. Andersen & M. E. Larsen (Denmark), J. Anglesio (France), J. C. Binz (Switzerland), G. A. Bookhout, 
R. J. Chapman (U. K.), H. van Haeringen (The Netherlands), R. Holzsager, O. P. Lossers (The Netherlands), G. Loudner, 
C. A. Minh, I. Nemes (Austria), R. C. Read (Canada), E. Schmeichel, T. White, Anchorage Math Solutions Group, NSA 
Problems Group, and the proposer. 
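
A brute-force check of the identity for small parameters is straightforward. The sketch below is an illustration added here. It assumes the identity reads $\binom{2n-m-1}{m} - \binom{n-1}{m} = \sum_{k,j \ge 0} \binom{k+j}{k} \binom{2n-m-2k-j-3}{2(n-m-k-1)}$ and that a binomial coefficient $\binom{a}{b}$ is taken to be zero unless $0 \le b \le a$. Both the summation range and that convention are assumptions of the sketch, and the helper names are invented for it.

```python
from math import comb


def binom(a, b):
    """Binomial coefficient, taken to be 0 unless 0 <= b <= a (assumed convention)."""
    return comb(a, b) if 0 <= b <= a else 0


def lhs(n, m):
    return binom(2 * n - m - 1, m) - binom(n - 1, m)


def rhs(n, m):
    # Terms vanish once k > n - m - 1 or j > m - 1, so these ranges are generous.
    return sum(binom(k + j, k) * binom(2 * n - m - 2 * k - j - 3, 2 * (n - m - k - 1))
               for k in range(n) for j in range(2 * n))


def inner_sum(n, m, k):
    """Sum over j for fixed k; the solution evaluates it as C(2n-m-k-2, m-1)."""
    return sum(binom(k + j, k) * binom(2 * n - m - 2 * k - j - 3, 2 * (n - m - k - 1))
               for j in range(2 * n))


identity_holds = all(lhs(n, m) == rhs(n, m)
                     for n in range(2, 13) for m in range(1, n))
print(identity_holds)
```

The function `inner_sum` isolates the step in which the sum on $j$ collapses to a single binomial coefficient for $0 \le k \le n - m - 1$, so that intermediate claim can be tested separately.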


A Bijection Between Sets of Permutations 


10364 [1994, 176]. Proposed by Frank Schmidt, Arlington, VA. Let $S_{2n}$ denote the symmetric 
group of degree $2n$. Let $E_{2n}$ (respectively $O_{2n}$) be the set of those permutations 
in $S_{2n}$ all of whose cycle lengths are even (respectively odd). Show that $E_{2n}$ and $O_{2n}$ are 
equinumerous by finding an explicit bijection between them. 


Solution by David Callan, University of Wisconsin, Madison, WI. We say that a permutation 
$\pi \in S_n$ is in standard form if its cycles are arranged so that the smallest element in each cycle 
occurs in the first position and these first elements are increasing left to right (for example, 
(1, 5, 2)(3, 6, 7)(4)(8) is in standard form). Each element of $O_{2n}$ has an even number 
of cycles. Suppose $\pi \in O_{2n}$ is $\pi_1 \pi_2 \cdots \pi_{2k}$ in standard form. Form the corresponding 
$\pi' \in E_{2n}$ as follows. For $i = 1, 2, \ldots, k$, move the second element of $\pi_{2i}$ to the end of 
$\pi_{2i-1}$ unless $\pi_{2i}$ is a singleton (fixed point), in which case move that element itself to the 
end of $\pi_{2i-1}$ (and delete $\pi_{2i}$). For example, if $\pi$ = (1, 5, 2)(3, 8, 7)(4)(6)(9, 12, 10)(11), 
then $\pi'$ = (1, 5, 2, 8)(3, 7)(4, 6)(9, 12, 10, 11). Note that $\pi'$ is also in standard form. 

The map $\pi \mapsto \pi'$ is the desired bijection. Its inverse is constructed as follows. Given 
$\pi' \in E_{2n}$ written as $\pi_1 \pi_2 \cdots \pi_j$ in standard form, let $a$ denote the last element of $\pi_1$, and 
let $b$ denote the first element of $\pi_2$ (if $j > 1$). If $a$ exceeds $b$, then move $a$ from $\pi_1$ so that 
it immediately follows $b$ in $\pi_2$ and begin the process anew at $\pi_3$ (if present). If $b$ exceeds 
$a$ or if $j = 1$, then make $a$ a singleton placed immediately after $\pi_1$, and begin the process 
anew at $\pi_2$ (if present). 
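The two maps described in this solution can be sketched in code and checked exhaustively for small degrees. This is an illustration under assumptions, not the solver's own program: a permutation is represented as a list of cycles already in standard form, and the names `cycles_of`, `odd_to_even`, `even_to_odd` and `is_bijection` are hypothetical.

```python
from itertools import permutations


def cycles_of(perm):
    """Cycles of perm (perm[i-1] is the image of i), produced in standard form."""
    seen, cycles = set(), []
    for start in range(1, len(perm) + 1):
        if start not in seen:
            cycle, x = [], start
            while x not in seen:
                seen.add(x)
                cycle.append(x)
                x = perm[x - 1]
            cycles.append(cycle)
    return cycles


def odd_to_even(cycles):
    """Forward map: pair up consecutive cycles and move one element across."""
    out = []
    for i in range(0, len(cycles), 2):
        first, second = list(cycles[i]), list(cycles[i + 1])
        if len(second) == 1:
            first.append(second[0])       # singleton: absorb it, delete the cycle
            out.append(first)
        else:
            first.append(second.pop(1))   # move the second element of the later cycle
            out.extend([first, second])
    return out


def even_to_odd(cycles):
    """Inverse map: compare the last element a of a cycle with the next first element b."""
    out, i = [], 0
    while i < len(cycles):
        head = list(cycles[i])
        a = head.pop()
        if i + 1 < len(cycles) and a > cycles[i + 1][0]:
            nxt = list(cycles[i + 1])
            nxt.insert(1, a)              # a goes immediately after b
            out.extend([head, nxt])
            i += 2
        else:
            out.extend([head, [a]])       # a becomes a singleton
            i += 1
    return out


def is_bijection(size):
    """Exhaustively check the two maps on the symmetric group of the given even degree."""
    odd, even = [], []
    for p in permutations(range(1, size + 1)):
        c = cycles_of(p)
        if all(len(x) % 2 == 1 for x in c):
            odd.append(c)
        elif all(len(x) % 2 == 0 for x in c):
            even.append(c)
    images = [odd_to_even(c) for c in odd]
    round_trip = all(even_to_odd(img) == c for img, c in zip(images, odd))
    return round_trip and sorted(images) == sorted(even)
```

Applied to the example in the text, `odd_to_even` sends (1, 5, 2)(3, 8, 7)(4)(6)(9, 12, 10)(11) to (1, 5, 2, 8)(3, 7)(4, 6)(9, 12, 10, 11), and `even_to_odd` sends it back.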


Editorial comment. Letting $T_{2n}$ be the subset of $E_{2n}$ consisting of the permutations whose 
cycles all have length 2, Robert Steinberg also exhibited bijections between $E_{2n}$ and $T_{2n} \times T_{2n}$, 
and between $O_{2n-1}$ and the subset of $T_{2n} \times T_{2n}$ consisting of all $(\alpha, \beta)$ such that $\alpha(1) = \beta(1)$. 


Solved also by D. Beckwith, O. P. Lossers (The Netherlands), R. Steinberg, A. N. ’t Woord (The Netherlands), and 
the proposer. 
REVIEWS 


Edited by Underwood Dudley 
Mathematics Department, De Pauw University, Greencastle, IN 46135 


The Encyclopedia of Integer Sequences, by N. J. A. Sloane and Simon Plouffe. 
Academic Press, New York—London, 1995, xiv + 587, $44.95. 


Reviewed by Richard Guy 


John Conway calls the Encyclopedia “the best present I’ve had in years”. It is a third 
of a century since Motzkin observed that the most mathematics you could get 
for your dollar was in Abramowitz & Stegun [1], and this is probably still true 
today. But surely one of the best contenders for second place must be the 
Encyclopedia, the more than twofold enlargement of the Handbook [14]. 
We will refer to sequences in the Handbook and in the Encyclopedia by their N 
and M numbers. Not only does the Encyclopedia contain an enormous amount of 
mathematics, but it also contains what is even more important, an enormous 
amount of potential mathematics. It is hard to think of a branch of mathematics 
where it won’t be useful, and very easy to think of other subjects where it will. 

Most of us confine our use of the Encyclopedia to diving into the middle and 
looking for the sequence of our current interest, but it’s well worth taking the time 
to read Chapter 1, which tells you how to get the best out of the book, Chapter 2 
on handling a strange sequence, and Chapter 3 on further topics. For example, in 
Chapter 2 we find a section on transformations: exponential, logarithmic, Euler, 
Möbius and binomial transforms. Here are two further examples which didn’t 
make it in time for publication: the boustrophedon transform and a transform 
arising from a problem of Recaman. 

A fascinating Pascal-like triangle occurs in recent work of Arnol’d [2,3]; start 
with 1; the next row starts with 0 and accumulates the members of the previous 
row, 0 → 1; we are now on the right and the next row starts there with 0 and 
accumulates from right to left, 1 ← 1 ← 0; then start again with 0 and accumulate 
from left to right, 0 → 1 → 2 → 2; and so on in the manner of the plough (Fig. 1). 


                                 1 
                               0 → 1 
                             1 ← 1 ← 0 
                           0 → 1 → 2 → 2 
                         5 ← 5 ← 4 ← 2 ← 0 
                      0 → 5 → 10 → 14 → 16 → 16 
                  61 ← 61 ← 56 ← 46 ← 32 ← 16 ← 0 
           0 → 61 → 122 → 178 → 224 → 256 → 272 → 272 


Figure 1. Numbers of updown permutations of 12... ending in k. 
Do you recognize the numbers in the border? The left border contains the zig 
numbers, traditionally called the Euler numbers or secant numbers; and the right 
border contains M2096, the zag or tangent numbers. What we have done is start 
with the sequence 1 0 0 0 0 ..., the coefficients in the exponential generating 
function for 1, and transformed it into M1492 (how did Columbus get in there?), 
1 1 1 2 5 16 61 ..., those in the exponential generating function for 


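$$\sec x + \tan x = 1 + \frac{1\cdot x}{1!} + \frac{1\cdot x^2}{2!} + \frac{2\cdot x^3}{3!} + \frac{5\cdot x^4}{4!} + \frac{16\cdot x^5}{5!} + \frac{61\cdot x^6}{6!} + \frac{272\cdot x^7}{7!} + \frac{1385\cdot x^8}{8!} + \frac{7936\cdot x^9}{9!} + \cdots,$$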


a function (formerly?) familiar to first year calculus students on learning ‘methods 
of integration’. More generally, Jessica Millar [11] discovered the simple relation 
between the exponential generating functions of the input and the output. For 
example, if we start with 1 1 1 1 1 ... (i.e., $e^x$) we obtain 1 2 4 9 24 77 ..., with 
e.g.f. $e^x(\sec x + \tan x)$, a near miss for M1194, the number of rhyme schemes, or 
M1195, the number of 2-connected planar maps. For a surprise, try feeding in the 
sequence 111251661 ... itself. 
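The boustrophedon transform described above is short to sketch in code. The version below is an illustration, with the hypothetical name `boustrophedon`: each new row starts with the next input term and accumulates the previous row read in the opposite direction, and the output is the last entry of each row.

```python
def boustrophedon(seq):
    """Boustrophedon transform of a finite list of integers."""
    out, prev = [], []
    for term in seq:
        row = [term]
        # Accumulate the previous row, traversed from its far end.
        for value in reversed(prev):
            row.append(row[-1] + value)
        out.append(row[-1])
        prev = row
    return out


if __name__ == "__main__":
    print(boustrophedon([1, 0, 0, 0, 0, 0, 0]))   # [1, 1, 1, 2, 5, 16, 61]
    print(boustrophedon([1, 1, 1, 1, 1, 1]))      # [1, 2, 4, 9, 24, 77]
```

Feeding the output of the first call back into `boustrophedon` is the experiment suggested at the end of the paragraph.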

The other transformation arises from the sequence of Recaman [12], namely 
$a_1 = 1$, and, for $n > 1$, $a_n = a_{n-1}/n$ or $a_{n-1} \times n$, according as $n$ divides $a_{n-1}$ or not. 
Sloane has generalized this into the transformation from $\{a_n\}$ to $\{b_n\}$ given by 
$b_1 = a_1$ and $b_{n+1} = \mathrm{lcm}(b_n, a_{n+1})/\gcd(b_n, a_{n+1})$. So Recaman’s sequence is 1 2 6 
24 120 20 140 1120 10080 1008 11088 924 12012 858 ..., while Sloane’s transform 
of 1 2 3 4 5 6 ... is 1 2 6 6 30 5 35 280 2520 252 2772 231 3003 858 1430 .... 
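Both sequences quoted above can be regenerated with a few lines of code. This is a sketch under one assumption about indexing: the term with index $n$ is obtained from the previous term by dividing by $n$ when that is possible and multiplying by $n$ otherwise, which reproduces the listed terms. The function names are illustrative.

```python
from math import gcd


def recaman_quotient_sequence(count):
    """First `count` terms: divide by n if n divides the previous term, else multiply by n."""
    terms = [1]
    for n in range(2, count + 1):
        prev = terms[-1]
        terms.append(prev // n if prev % n == 0 else prev * n)
    return terms


def lcm_gcd_transform(a):
    """Sloane's transform: b_1 = a_1, b_{n+1} = lcm(b_n, a_{n+1}) / gcd(b_n, a_{n+1})."""
    b = [a[0]]
    for x in a[1:]:
        g = gcd(b[-1], x)
        b.append((b[-1] * x // g) // g)   # lcm is b*x/g; divide once more by g
    return b


if __name__ == "__main__":
    print(recaman_quotient_sequence(14))
    print(lcm_gcd_transform(list(range(1, 16))))
```

The first call prints the 14 terms ending in 858 and the second prints the 15 terms ending in 1430.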

One way in which the Encyclopedia could be made even more useful would be 
by including more arrays. The authors are aware of the problem. The only arrays 
that I found are M0663, the partitions; M1645, Pascal’s (really Omar Khayyam’s) 
triangle; M1722, which gives a way of multiplicatively encoding arrays (the rubric 
should read $2^1 = 2$, $2^1 3^1 = 6$, $2^1 3^2 5^1 = 90$, $2^1 3^3 5^3 7^1 = 47250$, 
$2^1 3^4 5^6 7^4 11^1 = 66852843750$); M3416, Euler’s triangle; and M4730 & M4981, the Stirling numbers 
of the first & second kinds. 


[As we go to press, Sloane tells me that 102 arrays have been added to the 
database by reading them by rows, or by diagonals if they’re rectangular. In 
particular, Pascal’s triangle comes out as 


1 1 1 1 2 1 1 3 3 1 1 4 6 4 1 1 5 10 10 5 1 ... 


and the number of partitions of n into k parts as 
1 1 1 1 1 1 1 2 1 1 1 2 2 1 1 1 3 3 2 1 1 1 3 4 3 2 1 1 1 4 5 5 3 2 1 1 ...] 


Here are a couple of examples (in addition to our Fig. 1) of what might have 
appeared, though three- and more-dimensional arrays present even greater 
problems. 

Suppose that you are interested [8,10] in the number of walks, $w_n(x, y)$, of $n$ 
steps, each in the direction N, S, E or W, starting from the origin and ending at the 
lattice point (x, y), which do not stray outside the positive quadrant. The numbers 
of such walks form a “Pascal quarter-pyramid” (Fig. 2). 
Figure 2. The layers, $0 \le n \le 8$, of a Pascal quarter-pyramid: values of $w_n(x, y)$. 


It can be shown that the entries in Fig. 2 are given by 

$$w_n(x, y) = \binom{n}{r}\binom{n+2}{s} - \binom{n+2}{r+1}\binom{n}{s-1},$$

where $r = \frac{1}{2}(n + x - y)$, $s = \frac{1}{2}(n - x - y)$. Which rows, columns and diagonals 
are in the Encyclopedia? The numbers of such walks with 2k + 2 steps from (0, 0) 
to (0,2), or from (0, 0) to (1, 1), or half this last number, do not appear to be there. 
Of course, the number of walks from (0, 0) to (k, k) in 2k steps, is well known, and 
is sequence M1645. 
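The walk counts are easy to generate by brute force, which gives an independent check on a closed form. The sketch below counts quadrant walks layer by layer and compares them with the formula $w_n(x, y) = \binom{n}{r}\binom{n+2}{s} - \binom{n+2}{r+1}\binom{n}{s-1}$, where $r = \frac{1}{2}(n + x - y)$ and $s = \frac{1}{2}(n - x - y)$. The function names and the zero-padding helper `binom` are assumptions made for the illustration.

```python
from math import comb


def walks_by_layers(n):
    """Return {(x, y): number of n-step N/S/E/W walks from the origin staying in x, y >= 0}."""
    layer = {(0, 0): 1}
    for _ in range(n):
        nxt = {}
        for (x, y), count in layer.items():
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                px, py = x + dx, y + dy
                if px >= 0 and py >= 0:
                    nxt[(px, py)] = nxt.get((px, py), 0) + count
        layer = nxt
    return layer


def binom(a, b):
    """Binomial coefficient, taken as 0 when b is outside 0..a."""
    return comb(a, b) if 0 <= b <= a else 0


def walks_formula(n, x, y):
    """Closed form for w_n(x, y); 0 when (x, y) cannot be reached in n steps."""
    if x < 0 or y < 0 or x + y > n or (n - x - y) % 2:
        return 0
    r = (n + x - y) // 2
    s = (n - x - y) // 2
    return binom(n, r) * binom(n + 2, s) - binom(n + 2, r + 1) * binom(n, s - 1)


if __name__ == "__main__":
    for n in range(9):
        layer = walks_by_layers(n)
        agree = all(walks_formula(n, x, y) == c for (x, y), c in layer.items())
        print(n, sum(layer.values()), agree)
```

For n = 8 the formula gives 1344 walks ending at (1, 1) and 1134 ending at (2, 2).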

A remarkable coincidence is that M1972, the entries $w_{2k}(0, 0)$ in Fig. 2, i.e., the 
product of successive Catalan numbers, $c_k c_{k+1}$, is twice $w_{2k-1}(0, 1)$, the number 
M3978, of inequivalent Hamiltonian rooted maps on 2k vertices; although Tutte 
[15] doesn’t give the formula in that form. Is there another opportunity for a purely 
combinatorial proof? 

For walks in the positive quadrant it’s more natural and symmetrical to ask for 
the numbers of walks which terminate at various distances from the origin, using 
the “Manhattan metric”, $x + y = n - 2s$. Fig. 3 shows the sums of the diagonals of 
Fig. 2. 


  Total     n    w_n(x + y), in order of increasing x + y (even for even n, odd for odd n) 

      1     0        1 
      2     1        2 
      6     2        2       4 
     18     3       10       8 
     60     4       10      34      16 
    200     5       70      98      32 
    700     6       70     308     258      64 
   2450     7      588    1092     642     128 
   8820     8      588    3024    3414    1538     256 
  31752     9     5544   12276    9834    3586     512 
 116424    10     5544   31680   43230   26752    8194    1024 
 426888    11    56628  141570  138424   69784   18434    2048 


Figure 3. Sums of diagonals of Fig. 2: values of $w_n(x + y) = w_n(n - 2s)$. 
The entries in Fig. 3 are 

$$w_n(x+y) = \binom{n+2}{s}\left[\binom{n}{s} + \binom{n}{s+1} + \cdots + \binom{n}{n-s}\right] - \binom{n}{s-1}\left[\binom{n+2}{s+1} + \binom{n+2}{s+2} + \cdots + \binom{n+2}{n-s+1}\right].$$

Except for small values of $s$, the truncated binomial expansions do not seem to 
have a simple closed form: 

$$w_n(n) = 2^n$$
$$w_n(n-2) = (n-2)2^n + 2$$
$$w_n(n-4) = \tfrac{1}{2}(n^2 - 5n + 2)2^n + n^2 + 3n - 2$$


An amusing curiosity is that $w_n(n - 2)$ is twice the genus of the $(n + 2)$-dimensional 
cube M3874 [13, or see Theorem 14 in 9], though here it may be less fruitful to 
look for a direct combinatorial connexion. 
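The three closed forms for small $s$ can be checked against a direct count. The sketch below enumerates quadrant walks, sums them by Manhattan distance $x + y$, and compares the last three diagonals with $2^n$, $(n-2)2^n + 2$ and $\frac{1}{2}(n^2 - 5n + 2)2^n + n^2 + 3n - 2$. It assumes $n \ge 4$ so that all three diagonals exist; the function names are illustrative.

```python
def diagonal_sums(n):
    """Return {x + y: number of n-step quadrant walks ending at that Manhattan distance}."""
    layer = {(0, 0): 1}
    for _ in range(n):
        nxt = {}
        for (x, y), count in layer.items():
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                px, py = x + dx, y + dy
                if px >= 0 and py >= 0:
                    nxt[(px, py)] = nxt.get((px, py), 0) + count
        layer = nxt
    sums = {}
    for (x, y), count in layer.items():
        sums[x + y] = sums.get(x + y, 0) + count
    return sums


def closed_forms(n):
    """Closed forms for the diagonals at distance n, n - 2 and n - 4 (assumes n >= 4)."""
    return {
        n: 2 ** n,
        n - 2: (n - 2) * 2 ** n + 2,
        n - 4: (n * n - 5 * n + 2) * 2 ** (n - 1) + n * n + 3 * n - 2,
    }


if __name__ == "__main__":
    for n in range(4, 12):
        sums = diagonal_sums(n)
        print(n, [sums[d] for d in sorted(sums)], closed_forms(n))
```

For n = 8 the direct count gives 588, 3024, 3414, 1538, 256, matching the corresponding row of Fig. 3.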

Of course, the first diagonal is there as M1129, and so is the third, M4723, but 
not the fourth. Subsequent ones are ruled out by the Encyclopedia’s Rule 3, since 
the first term which exceeds one also exceeds 999, but the online version of the 
Encyclopedia (which by now has more than 12000 entries!) now includes such 
sequences. 

The unrelenting cascade of numbers is now relieved by numerous figures, some 
of which give us pictures of the actual objects being counted. Others list Hard, 
Disallowed and Silly sequences! Many mistakes have been corrected. N0268, in 
which Cayley tried to count hydrocarbons, is now supplanted by M0718. N1186 & 
N1635, where Cayley made some errors in connexion with his game of Mousetrap 
[5], have been corrected to M2945 & M3962, and N1423 has been extended to 
M3507. N1623 turned out to have some erroneous values. It should have been the 
same as N1622, which is now M3939; one of a wealth of examples where the same 
sequence occurs in quite different contexts (combination locks, evaluation of an 
integral, barycentric subdivisions of a simplex). 

Many sequences now contain more terms than earlier, but several are crying out 
for similar treatment. In some cases even one more term would be a noteworthy 
contribution to mathematics. How many projective planes are there of order 11? 


M5482: what is the number of magic squares of order 6? 

M2817: what is the number of topologies on a set of 8 elements? 

M1197: how many geometries (matroids) are there with 9 points? 

M1585: what is the maximum kissing number [4] for a 10-dimensional lattice? 
M3690: how many reduced latin squares are there of order 11? 

M1495: how many partially ordered sets are there with 14 elements? 

M0219: how many 26-dimensional unimodular lattices [4] are there? 

M3736: what is the number of inequivalent Hadamard matrices of order 32? 


Of course, we must beware of the Strong Law of Small Numbers! What is the 
sequence 


2 3 5 7 11 13 17 19 23 ...? 
Is it M0651 or M0652? Or even M0653? How about 
1 3 7 19 51 141 393 1107 3139 ? 
Is the next term 8953 or 8955 [7]? And should the sequence 


1, 0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 6, 7, 9, 10, 12, 14, 17, 19, 23, 26, 


continue as in M0265, or as in M0266, or possibly [6] with 31, 35, 41, 46, 54, 60, 69, 
78, 89,... ? 

I won’t give the email address for the online version, nor that of superseeker, 
in case the network seizes up from all your enquiries. Buy the book and find out all 
about them. 
TELEGRAPHIC REVIEWS 


Edited by Arnold Ostebee 


with the assistance of the Mathematics Departments of 
Carleton, Macalester, and St. Olaf Colleges 


Telegraphic Reviews are designed to alert readers in a timely manner to new books 
appropriate to mathematics teaching and research. Special codes classify reviews by 
subject area and appropriate use: 


T : Textbook 
C : Computer Software 
S : Supplementary Reading 
P : Professional Reading 
L : Undergraduate Library 
13: Grade Level 
1-4: Semester 
*, **: Special Emphasis 
?? : Questionable 


Readers are advised that price information is subject to change. Selected books 
receive a second, more extensive review in the Monthly. 


Books submitted for review should be sent to Book Reviews Editor, American Mathe- 
matical Monthly, St. Olaf College, 1520 St. Olaf Avenue, Northfield, MN 55057-1098. 


General, P. Einstein Atomized: More Science 
Cartoons. Sidney Harris. Copernicus (Imprint: 
Springer-Verlag), 1996, $14 (P). [ISBN 0-387- 
94665-9] 


General, L. The Nature of Space and Time. 
Stephen Hawking, Roger Penrose. Princeton 
Univ Pr, 1996, ix + 141 pp, $24.95. [ISBN 
0-691-03791-4] Is quantum mechanics a final 
theory? Can it be combined with general rel- 
ativity to produce quantum gravity? In 1994, 
in conscious imitation of the famous Einstein- 
Bohr debate, Hawking and Penrose delivered a 
series of lectures and a capstone debate on the 
state of physics. Penrose, like Einstein, believes 
that the real world exists and that physicists ex- 
ist to explain it. Hawking counters, “...I don’t 
know what it [reality] is,”’ and maintains that 
physicists exist to construct theories that accu- 
rately predict the results of measurements. SK 


Mathematics Appreciation, P. Get a Grip on 
Your Math. William J. Adams. Kendall/Hunt, 
1996, xiii + 256 pp, $18.95 (P). [ISBN 0-7872- 
1561-9] Illustrates and discusses the use and 
mis-use of numbers and mathematical models 
in everyday life. DH 


Mathematics Appreciation, P. Get a Firmer 
Grip on Your Math. William J. Adams. 
Kendall/Hunt, 1996, vii + 290 pp, $18.95 (P). 
[ISBN 0-7872-1562-7] Provides further in- 
vestigation and discussion questions for ideas 
developed in Get a Grip on Your Math. DH 


Education. The Role of Mathematics in Mod- 
ern Engineering. Eds: Alan K. Easton, Joseph 
M. Steiner. Studentlitteratur, 1996, 724 pp. 
[ISBN 91-44-00058-8] Proceedings of the 1st 
Biennial Engineering Mathematics Conference, 
AEMCM, held in July 1994 in Melbourne, Aus- 
tralia. 


History, P, L*. Celebrating Women in Math- 
ematics and Science. Ed: Miriam P. Cooney. 
NCTM, 1996, vii + 223 pp, $22.50 (P). 
[ISBN 0-87353-425-5] Intended for middle 
and junior high school students, these 22 
biographies—from Hypatia to Dian Fossey to 
Mary Ellen Rudin—are captivating readings for 
everyone. Each biography ends with suggested 
readings. DH 


History, S, P*, L*. Sources of Hyperbolic Ge- 
ometry. John Stillwell. History of Math., V. 10. 
AMS and London Math Society, 1996, ix + 
153 pp, $39. [ISBN 0-8218-0529-0] Intro- 
ductory commentaries (with lists of references) 
and English translations of papers on hyperbolic 
geometry by Beltrami, Klein, and Poincaré. A 
great resource for those interested in the devel- 
opment of non-Euclidean geometry. JNC 


History, L. Modern Algebra and the Rise of 
Mathematical Structures. Leo Corry. Science 
Networks: Historical Studies, V. 17. Birkhauser 
Boston, 1996, 460 pp, $139. [ISBN 0-8176- 
5311-2] The development of the idea of mathematics 
as a study of structures as exemplified 
by ideal theory and category theory. Concen- 
trates on contributions of Dedekind, Hilbert, 
Fraenkel, Noether, Ore, and the Bourbaki 
group. DB 

Logic, P. Logic and Algebra. Eds: Aldo 
Ursini, Paolo Agliano. Lect. Notes in Pure & 
Appl. Math., V. 180. Marcel Dekker, 1996, 
xv + 702 pp, $175 (P). [ISBN 0-8247-9606- 
3] Proceedings of a 1994 conference in Siena 
(Italy) honoring Roberto Magari. 


Combinatorics, P. Computational and Con- 
structive Design Theory. Ed: W.D. Wallis. 
Math. & Its Applic., V. 368. Kluwer Academic, 
1996, xiv + 357 pp, $165. [ISBN 0-7923- 
4015-9] 11 papers (2 tutorial) on computa- 
tional techniques in constructive design theory. 


Number Theory, T(18: 1). Additive Num- 
ber Theory: The Classical Bases. Melvyn B. 
Nathanson. Grad. Texts in Math., V. 164. 
Springer-Verlag, 1996, xiv + 342 pp, $49.95. 
[ISBN 0-387-94656-X] Thorough account of 
Waring’s problem (number of representations 
of an integer as a sum of a specified number 
of specified powers) and Goldbach’s conjecture 
(the existence of a representation of an even in- 
teger as a sum of two primes). Describes the 
circle method and sieve techniques. With his- 
torical comments and exercises. DB 


Number Theory, T(18: 1). Additive Num- 
ber Theory: Inverse Problems and the Geom- 
etry of Sumsets. Melvyn B. Nathanson. Grad. 
Texts in Math., V. 165. Springer-Verlag, 1996, 
xiv + 293 pp, $49.95. [ISBN 0-387-94655- 
1] A variety of results in additive number the- 
ory including inverse problems (determine the 
original set from the sums of elements) and 
problems involving congruence classes, lattice 
points, graphs, and combinatorics. With histor- 
ical notes and exercises. DB 


Linear Algebra, T(16-18: 1, 2), S, P, L. 
A Polynomial Approach to Linear Algebra. 
Paul A. Fuhrmann. Universitext. Springer- 
Verlag, 1996, xiii + 360 pp, $39 (P). [ISBN 
0-387-94643-8] Not an introductory linear 
algebra text; accessible only after previous 
courses in linear algebra and algebraic struc- 
tures. Supplies an interesting gateway to sev- 
eral non-standard topics including shift opera- 
tors, quadratic forms, system theory, and Han- 
kel norm approximation. Formal and rigorous; 
linear transformations are studied by looking at 
module structure induced by rings of polyno- 
mials. Exercises are largely theoretical, little 
routine computation. JS 


Algebra, T(18), P. Foundations of Quantum 
Group Theory. Shahn Majid. Cambridge Univ 
Pr, 1995, xix + 607 pp, $100. [ISBN 0-521- 
46032-8] Emphasizes the algebraic theory of 
quantum groups and (more generally) Hopf al- 
gebras. Provides detailed proofs, thorough mo- 
tivations, and a diverse choice of topics. TH 


Algebra, T(18), P. Cohomology of Drinfeld 
Modular Varieties, Part I: Geometry, Counting 
of Points and Local Harmonic Analysis. Gérard 
Laumon. Stud. in Adv. Math., V. 41. Cambridge 
Univ Pr, 1996, xiii + 344 pp, $64.95. 
[ISBN 0-521-47060-9] Self-contained devel- 
opment of the function field analogs of Shimura 
varieties over number fields. TH 


Algebra, P. The Group Fixed by a Family 
of Injective Endomorphisms of a Free Group. 
Warren Dicks, Enric Ventura. Contemp. Math., 
V. 195. AMS, 1996, ix + 81 pp, $19 (P). 
[ISBN 0-8218-0564-9] An algebraic proof of 
the Bestvina-Handel Theorem, which gives an 
upper bound on the rank of the fixed group of 
an automorphism of a free group. TH 


Algebra, P. Nilpotent Lie Algebras. Michel 
Goze, Yusupdjan Khakimdjanov. Math. & Its 
Applic., V. 361. Kluwer Academic, 1996, xv + 
336 pp, $170. [ISBN 0-7923-3932-0] 


Algebra, P. Rings, Groups, and Algebras. 
Eds: X.H. Cao, et al. Lect. Notes in Pure & 
Appl. Math., V. 181. Marcel Dekker, 1996, 
viii + 332 pp, $150 (P). [ISBN 0-8247-9733- 
7] Survey articles and recent research results. 
Summarizes the most significant developments 
in rings and algebras made in China since the 
1950’s. 


Calculus, S(13). Calculus: A Lab Course with 
MicroCalc. Harley Flanders. Springer-Verlag, 
1996, xi + 332 pp, $39.95 (P). [ISBN 0-387- 
94496-6] A lean calculus text using the au- 
thor’s software. PF 


Calculus, S?(13). A Concise Introduction to 
Calculus. W.Y. Hsiang. Ser. on Univ. Math., 
V. 6. World Scientific, 1995, vii + 157 pp, 
$21 (P); $39. [ISBN 981-02-1901-6; 981-02- 
1900-8] A meager attempt to explain calculus 
in a minimum number of pages. PF 


Calculus, T(13: 1). Brief Calculus with Ap- 
plications in Business and the Social and Life 
Sciences. Daniel C. Alexander. H & H Pub, 
1996, x + 806 pp, $39.95. [ISBN 0-943202-51- 
5] Standard textbook for business and liberal 
arts students. Makes little attempt to do any- 
thing modern. PF 


Calculus, S*(13). Maple V Calculus Labs, Sec- 
ond Edition. Abi Fattahi. Brooks/Cole, 1996, 
ix + 109 pp, $16.95 (P). [ISBN 0-534-26208-2] 
An excellent supplement. The labs are clearer 
than in the previous edition and include new 
examples. PF 

Real Analysis, T*(15-17: 1, 2), S*, L. Ad- 
vanced Calculus: A Course in Mathematical 
Analysis. Patrick M. Fitzpatrick. PWS, 1996, 
xix + 555 pp, $86.25. [ISBN 0-534-92612-6] 
A clear, readable, comprehensive introduction 
to a wide variety of topics in real analysis, rich 
with examples and exercises. Culminates with 
integral formulas of Green and Stokes in the 
plane and space. PZ 


Complex Analysis, T(18: 1). Complex Anal- 
ysis and Special Topics in Harmonic Analysis. 
Carlos A. Berenstein, Roger Gay. Springer- 
Verlag, 1995, x + 482 pp, $79. [ISBN 0- 
387-94411-7] Boundary values of holomor- 
phic functions, ideal theory in algebras of entire 
functions, the G-transform, summation meth- 
ods, and questions in harmonic analysis such as 
how to recover a continuous function from its 
averages over intervals of different lengths. DB 


Partial Differential Equations, T(18: 2), L. 
Partial Differential Equations. Michael E. Tay- 
lor. Springer-Verlag, 1996, $69 each. I: Basic 
Theory, Appl. Math. Sci., V. 115, xxi + 563 pp, 
[ISBN 0-387-94653-5]; II: Qualitative Studies 
of Linear Equations, Appl. Math. Sci., V. 116, 
xxi + 528 pp, [ISBN 0-387-94651-9]; III: Nonlinear 
Equations, Appl. Math. Sci., V. 117, xxi 
+ 610 pp. [ISBN 0-387-94652-7] (I: Basic 
Theory is available in paperback as V. 23 in the 
Texts in Applied Mathematics series.) Good 
introduction to PDE’s. Thorough classical cov- 
erage, though some important modern topics 
(e.g., solitons) are missing. Many exercises. JO 


Dynamical Systems, P. Dynamical Systems of 
Algebraic Origin. Klaus Schmidt. Progress in 
Math., V. 128. Birkhauser Boston, 1995, xviii 
+ 310 pp, $89. [ISBN 0-8176-5174-8] The 
systems studied are homomorphisms of $\mathbb{Z}^d$ to 
the automorphisms of compact (usually abelian) 
groups. SK 


Dynamical Systems, P. Algorithms, Fractals, 
and Dynamics. Ed: Y. Takahashi. Plenum Pr, 
1995, viii + 227 pp, $85. [ISBN 0-306-45127- 
1] Papers from symposia held in 1995 at the 
Fujisaki Institute of Hayashibara Biochemical 
Laboratories, Inc., and at Kyoto University. 


Dynamical Systems, P. Dynamics in Several 
Complex Variables. John Erik Fornæss. CBMS 
Reg. Conf. Ser. in Math., No. 87. AMS, 1996, 
vii + 59 pp, $19 (P). [ISBN 0-8218-0317-4] 
Ten brief, clear, readable lectures (from a 1994 
CBMS conference held in Albany) comprise 
an introduction to recent literature and open 
problems in higher-dimensional complex dy- 
namics. Lectures offer motivation rather than 
complicated proofs—the author believes that 
“mathematicians can read arbitrarily compli- 
cated material [for themselves] once they are 
motivated.” PZ 


Dynamical Systems, T(14-15: 1), C, L. A 
First Course in Discrete Dynamical Systems, 
Second Edition. Richard A. Holmgren. Uni- 
versitext. Springer-Verlag, 1996, xv + 223 pp, 
$29.95 (P). [ISBN 0-387-94780-9] Technical, 
yet accessible introduction to iterated functions; 
requires only calculus. Reorganization of 
First Edition (TR, January 1995) makes metric 
spaces and symbolic dynamics optional. RM 


Operator Theory, P. Integral Equations with 
Difference Kernels on Finite Intervals. Lev A. 
Sakhnovich. Operator Theory: Adv. & Applic., 
V. 84. Birkhauser Boston, 1996, vi + 177 pp, 
$94. [ISBN 0-8176-5267-1] 


Operator Theory, P. Toeplitz Operators and 
Index Theory in Several Complex Variables. 
Harald Upmeier. Operator Theory: Adv. & Ap- 
plic., V. 81. Birkhauser Boston, 1996, 481 pp, 
$161. [ISBN 0-8176-5282-5] 


Operator Theory, P. Functional Calculus 
of Pseudodifferential Boundary Problems, Sec- 
ond Edition. Gerd Grubb. Prog. in Math., 
V. 65. Birkhauser Boston, 1996, viii + 522 pp, 
$89.50. [ISBN 0-8176-3738-9] First Edition, 
TR, November 1987. 


Analysis, T*(14: 2), L*. Mathematical Analy- 
sis: An Introduction. Andrew Browder. Under- 
grad. Texts in Math. Springer-Verlag, 1996, xiv 
+ 333 pp, $39. [ISBN 0-387-94614-4] First 
third is a careful development of standard topics 
from calculus including series, continuity, and 
Riemann integration. Middle third is topology 
with applications (e.g., geodesics in compact 
metric spaces). Final third is about calculus 
on manifolds, including differential forms, and 
Brouwer’s fixed point theorem. An impressive 
list of topics in a small, reasonably priced book. 
Many exercises and a useful bibliography. TAV 


Analysis, P. Lecture Notes in Control and In- 
formation Sciences—219: ICASO ’96: Images, 
Wavelets and PDEs. Eds: Marie-Odile Berger, 
et al. Springer-Verlag, 1996, xv + 359 pp, 
$63 (P). [ISBN 3-540-76076-8] Proceedings 
of the 12th International Conference on Analy- 
sis and Optimization of Systems (June 1996) in 
Paris, France. 


Algebraic Geometry, P. Algorithms in Alge- 
braic Geometry and Applications. Eds: Lau- 
reano Gonzalez-Vega, Tomas Recio. Progress 
in Math., V. 143. Birkhauser Boston, 1996, ix 
+ 399 pp, $98.50. [ISBN 0-8176-5274-4] 20 
papers from the MEGA-94 Conference at the 
University of Cantabria (Spain). 

Algebraic Geometry, P. Moduli of Vector Bun- 
dles. Ed: Masaki Maruyama. Lect. Notes in 
Pure & Appl. Math., V. 179. Marcel Dekker, 
1996, viii + 305 pp, $135 (P). [ISBN 0-8247- 
9738-8] 20 papers from the 35th Taniguchi 
International Symposium (1994). 


Differential Geometry, P. Fundamental 
Groups of Compact Kähler Manifolds. J. 
Amorós, et al. Math. Surveys & Mono., V. 44. 
AMS, 1996, xi + 140 pp. [ISBN 0-8218-0498-7] 


Differential Geometry, P. Manifolds and Ge- 
ometry. Eds: Paolo de Bartolomeis, Franco 
Tricerri, Edoardo Vesentini. Cambridge Univ 
Pr, 1996, ix + 321 pp, $59.95. [ISBN 0-521- 
56216-3] Proceedings of a 1993 conference 
held in Pisa to honor Eugenio Calabi. 


Differential Geometry, T(17-18: 1), P. Rie- 
mannian Geometry and Geometric Analysis. 
Jürgen Jost. Springer-Verlag, 1995, xi + 401 pp, 
$54 (P). [ISBN 0-387-57113-2] Introductory 
text on geometric and analytic methods in study 
of Riemannian manifolds. Reasonably self- 
contained. Nonlinear analysis techniques are 
introduced early and used throughout. Uses 
both invariant global and tensor notation. RM 


Differential Geometry, T(17—18: 1), S, P. An 
Introduction to Lorentz Surfaces. Tilla Weinstein. 
Expos. in Math., V. 22. Walter de 
Gruyter, 1996, xiii + 213 pp, DM 168. [ISBN 
3-11-014333-X] Lorentz surfaces (manifolds 
provided with set of metrics conformally equiv- 
alent to an indefinite Lorentz metric) are more 
subtle and complicated than their Riemann sur- 
face analogs, and have emerged as a tool in rela- 
tivity theory. Introduction to current research in 
2-dimensional Lorentz geometry, comparison 
with Euclidean and Minkowski 3-spaces. RM 


Geometry, T(16), P, L. Minkowski Geome- 
try. A.C. Thompson. Ency. of Math. & Its 
Applic., V. 63. Cambridge Univ Pr, 1996, xvi + 
346 pp, $59.95. [ISBN 0-521-40472-X] Cov- 
erage of topological properties of Minkowski 
spaces (i.e., finite dimensional normed linear 
spaces), characterizations of Euclidean space 
among normed spaces, area and volume in 
normed spaces, and trigonometry in Minkowski 
spaces. Chapters end with historical notes. 
Lists 50 unsolved problems. JNC 


Topology, P. Compact Projective Planes With 
an Introduction to Octonion Geometry. Helmut 
Salzmann, et al. Expos. in Math., V. 21. Walter 
de Gruyter, 1995, xiii + 688 pp, DM 258. [ISBN 
3-11-011480-1] 40 years of research on topo- 
logical projective planes, collineation groups, 
and incidence geometry. Approach connects 
group theory to geometry. JD 

Topology, P. Categorical Topology. Ed: Eraldo 
Giuli. Kluwer Academic, 1996, v + 280 pp, 
$130. [ISBN 0-7923-4049-3] Proceedings of 
a 1994 workshop at the University of L'Aquila. 
Some of the papers are reprinted from Applied 
Categorical Structures 4 (1996). 


Operations Research, P, L. Queueing Systems: 
Problems and Solutions. Leonard Kleinrock, 
Richard Gail. Wiley, 1996, ix + 227 pp, 
$34.95 (P). [ISBN 0-471-55568-1] All the 
problems from Queueing Systems, Volume 1: 
Theory by L. Kleinrock (Wiley-Interscience, 
1976) and their solutions. A brief, introduc- 
tory section is “A Queueing Theory Primer.” 


Optimal Control, P. Mathematical Theory of 
Control Systems Design. V.N. Afanas’ev, V.B. 
Kolmanovskii, V.R. Nosov. Math. & Its Ap- 
plic., V. 341. Kluwer Academic, 1996, xxiii + 
668 pp, $279. [ISBN 0-7923-3724-7] 


Probability, T(15-16: 1). Probability and In- 
formation: An Integrated Approach. David Ap- 
plebaum. Cambridge Univ Pr, 1996, xiii + 
212 pp, $24.95 (P); $69.95. [ISBN 0-521- 
55528-0; 0-521-55507-8] Concise presenta- 
tion of core material on both probability and 
information; defines probability as a measure 
on a Boolean algebra. Suggestions for further 
reading conclude each chapter. RSK 


Elementary Statistics, T(13—14: 1, 2). Gen- 
eral Statistics, Third Edition. Warren Chase, 
Fred Bown. Wiley, 1997, xv + 713 pp, $67.95. 
[ISBN 0-471-05584-0] Changes (Second Edi- 
tion, TR, February 1993) include more use of 
exploratory data analysis, an earlier introduc- 
tion to descriptive aspects of regression and cor- 
relation, a re-worked probability chapter; vari- 
ance tests now relegated to an appendix. In- 
cludes data from the Framingham Heart Study, 
and problem sets based on it. Examples and 
exercises include output from a variety of sta- 
tistical packages. RSK 


Mathematical Statistics, T(18: 1, 2), P. Mul- 
tivariate Statistical Analysis. Narayan C. Giri. 
Stat.: Textbooks & Mono., V. 149. Marcel 
Dekker, 1996, xii + 378 pp, $135. [ISBN 0- 
8247-9338-2] Culmination of the author’s ex- 
tensive research and teaching experience. Em- 
phasizes the invariance approach. Primarily 
theoretical; few examples and exercises based 
on real data. RSK 


Mathematical Statistics, T(16-17: 2, 3). 
Probability and Statistical Inference. Robert 
Bartoszynski, Magdalena Niewiadomska-Bugaj. 
Ser. in Prob. & Stat. Wiley, 1996, xvi + 826 pp, 
$59.95. [ISBN 0-471-31073-5] Solid text 
stressing comprehension over skill acquisition. 
Contains no computer references and does not 
talk down to students. Includes much extra ma- 
terial for mathematically-minded students, such 
as Lebesgue and Stieltjes integrals and chapters 
on Markov chains and discrimination. RSK 


Statistical Methods, P. MSI-2000: Multivariate 
Statistical Analysis in Honor of Professor 
Minoru Siotani. Eds: Takesi Hayakawa, 
Makoto Aoshima, Kunio Shimizu. Amer. 
Journ. of Math. & Management Sci., V. 15, 
Nos. 3 & 4. 1995, 237 pp, $125 (P). [ISBN 
0-935950-38-9] Proceedings of a 1995 con- 
ference at the University of Hawaii. 


Programming, P. Java in a Nutshell: A Desk- 
top Quick Reference for Java Programmers. 
David Flanagan. O’Reilly & Associates, 1996, 
xix + 438 pp, $14.95 (P). [ISBN 1-56592-183- 
6] 


Computer Systems, P. CGI Programming on 
the World Wide Web. Shishir Gundavaram. 
O’Reilly & Associates, 1996, xiv + 433 pp, 
$29.95 (P). [ISBN 1-56592-168-2] 


Computer Systems, P, L. HTML: The Defini- 
tive Guide. Chuck Musciano, Bill Kennedy. 
O’Reilly & Associates, 1996, xx + 385 pp, 
$27.95 (P). [ISBN 1-56592-175-5] 


Computer Science, T(15: 1). An Introduc- 
tion to High-Performance Scientific Computing. 
Lloyd D. Fosdick, et al. Sci. & Eng. Comput. 
MIT Pr, 1996, xxiii + 760 pp, $55. [ISBN 0- 
262-06181-3] Concrete introduction to tools 
(including UNIX, Fortran, MATLAB, etc.), al- 
gorithms, and applications of high-performance 
computing. Applications include molecular dy- 
namics, advection, and tomography. JO 


Computer Science, T(15-16: 1), S, L. Al-
gebraic Semantics of Imperative Programs.
Joseph A. Goguen, Grant Malcolm. Found. 
of Comp. Ser. MIT Pr, 1996, ix + 228 pp, $32. 
[ISBN 0-262-07172-X] Nice introduction to 
science of reasoning about programs. Goals: 
enable undergraduates to better understand the 
semantics of programs; develop intuitions about 
programming; rigorously verify properties of 
programs. Algebraic variation of the axiomatic 
approach; uses equational logic and OBJ (func- 
tional metalanguage, with equations as state- 
ments, proofs as computations) for expressing 
and proving program properties. RM 


Applications (Engineering), T(17: 2). Math- 
ematical Analysis in Engineering: How to Use 
the Basic Tools. Chiang C. Mei. Cambridge 
Univ Pr, 1995, xvii + 461 pp. [ISBN 0-521- 
46053-0] Includes all standard introductory 
topics (e.g., Fourier series, Green’s functions, 
transforms) as well as chapters on perturbation 
methods and symbolic computation. Topics 
are introduced and explained through physical 
problems and examples. SK 


Applications (Fluid Mechanics), P. Advances 
in Multi-Fluid Flows. Eds: Yuriko Y. Re- 
nardy, et al. SIAM, 1996, xvi + 432 pp, 
$75 (P). [ISBN 0-89871-377-3] Proceedings
of the 1995 Conference on Multi-Fluid Flows 
and Interfacial Instabilities at the University of 
Washington. 


Applications (Mechanics), P. Hamiltonian 
Dynamics and Celestial Mechanics. Eds: Don- 
ald G. Saari, Zhihong Xia. Contemp. Math., 
V. 198. AMS, 1996, vii + 240 pp, $72 (P). 
[ISBN 0-8218-0566-5] Proceedings of a 1995 
AMS-IMS-SIAM Joint Summer Research Con- 
ference at the University of Washington. 


Applications (Systems Theory), P. Control 
of Uncertain Sampled-Data Systems. Geir E. 
Dullerud. Systems & Control: Found. & Ap-
plic. Birkhäuser Boston, 1996, xiv + 177 pp,
$42.50 (P). [ISBN 0-8176-3851-2] 


Applications (Systems Theory), P. Mod- 
elling and Optimization of Distributed Param- 
eter Systems Applications to Engineering. Eds: 
Kazimierz Malanowski, Zbigniew Nahorski, 
Małgorzata Peszyńska. Chapman & Hall, 1996,
x + 387 pp, $100. [ISBN 0-412-72700-5] Pa-
pers from a 1995 conference held in Poland.


Applications (Systems Theory), P. Lecture 
Notes in Control and Information Sciences— 
214: Recent Advances in Control and Op- 
timization of Manufacturing Systems. Eds: 
George Yin, Qing Zhang. Springer-Verlag, 
1996, 222 pp, $54 (P). [ISBN 3-540-76055- 
5] 8 papers on optimal production planning, 
scheduling and improvability, and approximate 
optimality and robustness. 


Applications, S, P*. A Probabilistic Analy- 
sis of the Sacco and Vanzetti Evidence. Joseph 
B. Kadane, David A. Schum. Ser. in Prob. & 
Stat. Wiley, 1996, xvi + 366 pp, $49.95. [ISBN 
0-471-14182-8] Uses an elaborate case study 
(the celebrated murder trial and appeals of an- 
archists Sacco and Vanzetti, convicted and exe- 
cuted) to show “how modern probabilistic meth- 
ods can be employed in the study of complex 
inferences based on masses of evidence.” In- 
cludes synopses of 28 Wigmore trial and post- 
trial evidence charts given in an appendix, and 
probabilistic analyses of the evidence. Thor- 
ough discussion of all aspects of the method- 
ology. Conclusion: Vanzetti innocent, Sacco’s 
guilt not proven. RSK 


Reviewers 


DB: David Bressoud, Macalester; JNC: Judith N. Ceder- 
berg, St. Olaf; PF: Paul Froeschl, Macalester; TH: Tom 
Halverson, Macalester; DH: Deanna Haunsperger, Car- 
leton; SK: Steve Kennedy, Carleton; RSK: Richard S. 
Kleber, St. Olaf; RM: Richard Molnar, Macalester; JO: 
Jeff Ondich, Carleton; JS: John Schue, Macalester; TAV: 
Theodore A. Vessey, St. Olaf; PZ: Paul Zorn, St. Olaf. 




THE AUTHORS 


BRYAN SHADER received his Ph.D. from the University of Wisconsin-Madison in 1990. He wrote his
dissertation on tournament matrices under the direction of Richard Brualdi. He was a post-doctoral 
fellow at the Institute for Mathematics and its Applications during their year on Applied Linear 
Algebra in 1991. He is an Associate Professor of Mathematics at the University of Wyoming. His 
primary research interests involve combinatorics and linear algebra. 


CHANYOUNG LEE SHADER received her Ph.D. from the University of Wisconsin-Madison in 1992.
Her dissertation was supervised by Georgia Benkart and concerned the representation theory of Lie 
algebras. She is an Assistant Professor of Mathematics at the University of Wyoming. Her current 
research interests lie in the use of the combinatorics of partitions and tableaux in representation theory 
of Lie superalgebras. 


RALPH H. BUCHHOLZ spent most of his youth in Teralba with numerous sojourns to the surrounding 
beaches to bodysurf. In 1981 he received an Honours degree in Mathematics from Newcastle Univer- 
sity. Ralph then became the Distribution Analyst of the Natural Gas Company. Four years later the 
lure of research drew him back to Newcastle University and in 1989 he completed a Ph.D. in 
Mathematics. Since 1991 he has worked for the Crypto-Mathematics Research Group while maintain- 
ing an avid interest in the theory of elliptic curves, the music of Kate Bush, and the Stone of Rosetta. 


RANDALL L. RATHBUN has been fascinated by mathematics since first reading in the seventh grade 
how the Lehmers factored large numbers. He credits Martin Gardner for steering his interests towards 
the rational cuboid problem and especially thanks his coauthor and Richard K. Guy for persuading him 
to obtain his bachelor’s degree in mathematical sciences at California State University San Marcos in
1994. He has taught science lessons for the K.I.D.S. program at an elementary school and algebra for 
the NSF Math Renaissance at a middle school. He hopes to awaken a triangle mini-renaissance in the 
math community and obtain a master’s degree shortly.


TAO KAI LAM received his Ph.D. from MIT under Richard P. Stanley in 1995. He did his undergradu-
ate studies at the National University of Singapore, where he is now a lecturer. His area of research is
combinatorics. Born in Singapore in 1966, he found out that he had spent 21 years of his life studying in 
one school or another. This makes the transition from one end of the lecture hall to the other a big 
challenge. He now indulges in chewing gum whenever he is out of the country. 


RONALD PRATHER completed his Ph.D. in 1969 at Syracuse University. He has been a member of the 
faculty at the University of Vermont, The University of Denver, and Syracuse University. He holds the 
Caruth Distinguished Professorship in Computer Science at Trinity University in San Antonio, TX. His 
research interests include programming languages and software metric theory. He is also the moderator 
of the Alec Wilder Mailing List on the Internet, a group devoted to discussions of the life and work of 
this uncommon American composer. 


R. BRUCE RICHTER is interested in graph theory, combinatorics, and the topology of surfaces. He 
received his Ph.D. in 1983 from the University of Waterloo, Canada and has taught at Utah State 
University, The Ohio State University, and The U.S. Naval Academy. He has been at Carleton since 
1988. 


CARSTEN THOMASSEN received his Master’s degree in 1972 at University of Aarhus, Denmark, and 
his Ph.D. in 1976 at University of Waterloo, Canada. Since 1981 he has been a professor of mathemat- 
ics at the Technical University of Denmark. He is coeditor-in-chief of the Journal of Graph Theory and 
he is a member of the Royal Danish Academy of Sciences and Letters. 


JOHN HOLTE liked to add long columns of numbers when he was in first grade. Subsequently he 
earned his B.S. in mathematics at Caltech and his Ph.D. at the University of Wisconsin. He has taught 
at Augustana College-Rock Island, Rensselaer Polytechnic Institute, and Gustavus Adolphus College.
His research interests lie in probability theory, but after teaching courses in applied combinatorics and 
fractal geometry and chairing the 1990 Nobel Conference on chaos, he began investigating a multifrac- 
tal description of multinomial coefficient divisibility and stumbled upon the “amazing matrix.” 


WALTER NEF was born in 1919 in Winterthur. He received his Ph.D. in Mathematics from the 
University of Zurich in 1942. From 1944 to 1948 he taught at the University of Fribourg, with a break in 
1946 for a stay at Brown University. In 1948 he was offered a chair at the University of Bern. In 1956/7 
he visited the National Bureau of Standards in D.C., where he got acquainted with applied mathematics 
and computer sciences. In Bern he founded the Institute for Applied Mathematics, which he headed 
until 1979. He became emeritus in 1984. His fields of interest are combinatorial and computational 
geometry. 


RICHARD GUY’s only formal mathematical education was at Warwick School and Cambridge Univer- 
sity. He has taught at all levels from kindergarten to postgraduate, and in many parts of the world: 
Britain, Singapore, India, Canada. Embarrassed by having a senior professor without a doctorate, The 
University of Calgary gave him an honorary degree in 1991. He still visits the mountains, both summer 
and winter. 




THE MATHEMATICA PROGRAMMER II


Roman Maeder


This book is a second volume to follow The Mathematica
Programmer (Academic Press, 1993) and is compatible
with the latest release of Mathematica, version 3.0.
The volume also includes a CD-ROM compatible with
both Macintosh and Windows which contains updated
programs from the first and second volumes, as well
as links to all relevant information.


DIFFERENTIAL EQUATIONS WITH MATHEMATICA


SECOND EDITION
Martha Abell and James


ISBN: 0-12-464992-0
Paperback: $39.95
January 1997


Patrick Tam
A Physicist’s Guide to Mathematica


Mathematica is a registered trademark of Wolfram Research, Inc.


A PRIMER OF REAL FUNCTIONS


FOURTH EDITION


The Carus Mathematical Monographs, Number 13


This is a revised, updated and augmented edition of 
a classic Carus monograph (a bestseller for over 25 
years) on the theory of functions of a real variable. 
Earlier editions of this classic Carus Monograph cov- 
ered sets, metric spaces, continuous functions, and 
differentiable functions. The fourth edition adds sec- 
tions on measurable sets and functions, the Lebesgue 
and Stieltjes integrals, and applications. The book is 
accessible to readers with some mathematical sophis- 
tication and a background in calculus. It is suitable 
either for self-study or for supplemental reading in a 
course on advanced calculus or real analysis. 


Not intended as a systematic treatise, this book has 
more the character of a sequence of lectures on a 
variety of topics connected with real functions. 
Many of these topics are not commonly encountered 
in undergraduate textbooks: for example, the exis- 
tence of continuous everywhere-oscillating functions 
(via the Baire category theorem); two functions hav- 
ing equal derivatives, yet not differing by a constant; 
application of Stieltjes integration to the speed of 
convergence of infinite series. 


THE MATHEMATICAL ASSOCIATION OF AMERICA 




A Primer of Real 
Functions 


by Ralph P. Boas
Revised and updated by Harold P. Boas


Series: Carus Mathematical Monograph 


Table of Contents. 

I. Sets: Sets of real numbers, Countable and uncount- 
able sets, Metric spaces, Open and closed sets, Dense 
and nowhere dense sets, Compactness, Convergence 
and completeness, Nested sets and Baire’s theorem, 
Some applications of Baire’s theorem, Sets of mea- 
sure zero. II. Functions: Functions, Continuous func- 
tions, Properties of continuous functions, Upper and 
lower limits, Sequences of functions, Uniform con- 
vergence, Pointwise limits of continuous functions, 
Approximations to continuous functions, Linear func- 
tions, Derivatives, Monotonic functions, Convex func- 
tions, Infinitely differentiable functions. III. 
Integration: Lebesgue measure, Measurable functions, 
Definition of the Lebesgue integral, Properties of 
Lebesgue integrals, Application of the Lebesgue inte- 
gral, Stieltjes integrals, Applications of the Stieltjes 
integral, Partial sums of infinite series. 


Catalog Code: CAM-13R/JR 

262 pp., Hardcover, 1996 

ISBN 0-88385-029-X 

List: $32.95 MAA Member: $24.95 


Phone in Your Order Now! 1-800-331-1622


Monday — Friday 8:30 am — 5:00 pm 


FAX (301) 206-9789 


or mail to: The Mathematical Association of America, PO Box 91112, Washington, DC 20090-1112 


Shipping and Handling: Postage and handling are charged as follows: USA orders (shipped via UPS): $2.95 for the first book, and $1.00 for each additional book.
Canadian orders: $4.50 for the first book and $1.50 for each additional book. Canadian orders will be shipped within 10 days of receipt of order via the fastest avail-
able route. We do not ship via UPS into Canada unless the customer specially requests this service. Canadian customers who request UPS shipment will be billed an
additional 7% of their total order. Overseas orders: $3.50 per item ordered for books sent surface mail. Airmail service is available at a rate of $7.00 per book.
Foreign orders must be paid in US dollars through a US bank or through a New York clearinghouse. Credit Card orders are accepted for all customers.


City State Zip 


Phone 


QTY. CATALOG CODE PRICE AMOUNT 
CAM-13R/JR 

All orders must be prepaid with the — Shipping & handling 

exception of books purchased for 

resale by bookstores and wholesalers. TOTAL 

Payment [ ] Check [ ] VISA [ ] MasterCard

Credit Card No.            Expires    /


Signature 


THE MATHEMATICAL ASSOCIATION OF AMERICA


Which Way did the
Bicycle Go?


and Other Intriguing Mathematical Mysteries


Joseph D. E. Konhauser, Dan Velleman, and Stan Wagon


Series: Dolciani Mathematical Expositions


This book contains the best problems selected
from over 25 years of the Problem of the Week
at Macalester College. Readers will find here a
collection of intriguing and thought provoking
problems that will give students (high school or
beyond), teachers, and university professors a
chance to experience the pleasure of wrestling
with some beautiful problems of elementary
mathematics.


Compare your sleuthing talents with those of
Sherlock Holmes, who made a bad mistake
regarding the first problem in the collection:
Determine the direction of travel of a bicycle
that has left its tracks in a patch of mud. The
collection contains a variety of other unusual
and interesting problems in geometry, algebra,
combinatorics and number theory. For exam-
ple, if a pizza is sliced into eight 45-degree
wedges meeting at a point other than the center
of the pizza, and two people eat alternate
wedges, will they get equal amounts of pizza?
Or: What is the rightmost nonzero digit of the
product 1 · 2 · 3 ··· 1000000? Or: Is a manufac-
turer’s claim that a certain unusual combination
lock allows thousands of combinations justified?


Complete solutions to the 191 problems are 
included along with problem variations and 
topics for investigation. This collection will be 
especially valuable to teachers who are looking 
for stimulating ways to engage their students 
with the beauty and intrigue that can often be 
found in elementary mathematics. 


Catalog Code: DOL-18/JR 
236 pp., Paperbound, 1996, ISBN 0-88385-325-6 
List: $24.95 MAA Member: $19.95 


Phone in Your Order Now! 1-800-331-1622


Monday — Friday 8:30 am — 5:00 pm 
or mail to: The Mathematical Association of America, PO Box 91112, Washington, DC 20090-1112 


FAX (301) 206-9789 


Shipping and Handling: Postage and handling are charged as follows: USA orders (shipped via UPS): $2.95 for the first book, and $1.00 for each additional book. 
Canadian orders: $4.50 for the first book and $1.50 for each additional book. Canadian orders will be shipped within 10 days of receipt of order via the fastest avail-
able route. We do not ship via UPS into Canada unless the customer specially requests this service. Canadian customers who request UPS shipment will be billed an
additional 7% of their total order. Overseas orders: $3.50 per item ordered for books sent surface mail. Airmail service is available at a rate of $7.00 per book.
Foreign orders must be paid in US dollars through a US bank or through a New York clearinghouse. Credit Card orders are accepted for all customers. 


Address 


City State Zip 


Phone 


QTY. CATALOG CODE PRICE AMOUNT
DOL-18/JR 

All orders must be prepaid with the — Shipping & handling 

exception of books purchased for 

resale by bookstores and wholesalers. TOTAL 

Payment [ ] Check [ ] VISA [ ] MasterCard

Credit Card No.            Expires    /


Signature 


Laboratory Experiences
in Group Theory


Ellen Maycock Parker


The Mathematical Association of America


A lab manual with software for introductory courses 
in group theory or abstract algebra 


Laboratory Experiences in Group Theory is a workbook 
of 15 laboratories designed to be used with the software 
Exploring Small Groups as a supplement to the regular 
textbook in an introductory course in group theory or 
abstract algebra. Written in a step-by-step manner, the 
laboratories encourage students to discover the basic 
concepts of group theory and to make conjectures from 
examples that are easily generated by the software. 
The labs can be assigned as homework or can be used 
in a structured laboratory setting. Since the software is 
user-friendly and the laboratories are complete, stu- 
dents and faculty should have no difficulty in using the 
labs without training. 


Most students find that the laboratories provide an 
enjoyable alternative to the “theorem-proof-example” 
format of a standard abstract algebra course. At the end 
of the semester, one student wrote in his evaluation of 
the course: 


I am truly grateful for the laboratory component...Work
on the computer helped to make the abstract theory
more concrete... One of the best things about the labs
was that we formed our own conjectures about the pat-
terns we saw...I believe that the progression of (1) lab,
(2) conjecture, (3) class discussion, and (4) proof was
highly beneficial in gaining understanding of the
abstract material of the course.


Laboratory Experiences
in Group Theory


A Manual to be Used with
Exploring Small Groups


Ellen Maycock Parker


Series: Classroom Resource Materials


Table of Contents: 1. Groups and Geometry; 2. Cayley
Tables; 3. Cyclic Groups and Cyclic Subgroups; 4.
Subgroups and Subgroup Lattices; 5. The Center and
Commutator Subgroups; 6. Quotient Groups; 7. Direct
Products; 8. The Unitary Groups; 9. Composition
Series; 10. Introduction to Endomorphisms; 11. The
Inner Automorphisms of a Group; 12. The Kernel of an
Endomorphism; 13. The Class Equation; 14. Conjugate
Subgroups; 15. The Sylow Theorems; Appendix A.
Table Generation Menu of Exploring Small Groups
(ESG); Appendix B. Sample Library of ESG; Appendix
C. Group Library of ESG; Appendix D. Group
Properties Menu


Exploring Small Groups, the software packaged with 
this lab manual, is on a 3½” DD PC-compatible disk.
This is a DOS program that can be run in Windows. 
The software was developed by Ladnor Geissinger, 
University of North Carolina at Chapel Hill. 


112 pp., Paperbound, 1996 

ISBN 0-88385-705-7 

List: $22.00 MAA Member: $16.00 
Catalog Code: LABE/JR


ORDER FROM: 
THE MATHEMATICAL ASSOCIATION OF AMERICA
PO Box 91112, Washington, DC 20090-1112 


1-800-331-1622 


Zip 


(301) 617-7800 FAX (301) 206-9789 


QTY. CATALOG CODE PRICE AMOUNT 

LABE/JR
TOTAL 

Payment [ ] Check [ ] VISA [ ] MasterCard

Credit Card No.            Expires    /

Signature 




Read the biographical essays written by individ- 
uals who have gotten exciting good-paying jobs 
by preparing themselves with a solid back- 
ground in the mathematical sciences. It will 
provide you and your students with a wealth of 
information about the types of different career 
paths that can be chosen for those who are 
well-prepared in mathematics. 


These mathematicians are found:

• in well-known companies such as IBM,
AT&T, and American Airlines,
• in some surprising places like FedEx
Corporation, L. L. Bean, Perdue Farms,
• in government agencies,
• in the arts (sculpture, music, and television),
• in the professions (law and medicine), and
• in education (elementary, secondary, college
and university).
101 Careers in
Mathematics 


Andrew Sterrett, Editor 


Series: Classroom Resource Materials 


A career guide 
for your students. 
If they want to know
why they should
study mathematics,
this book will tell
them.


Many of these individuals have started their 
own companies. 


Your students will see how these individuals use 
their mathematical sciences training on a daily 
basis in their work, often relying on the general 
problem-solving skills they have acquired in 
their mathematics courses. Those who studied 
statistics and computer science as well as mathe- 
matics, tell how their training in these disciplines 
helped them advance in their careers. 


Articles in the Appendix reprinted from the 
MAA’s magazine for students, Math Horizons, 
provide valuable advice on looking for a job 
and the expectations of industry. 
Vita Mathematica


Historical Research and Integration with Teaching


Ronald Calinger, Editor


The use of the history of mathematics in the teaching
of mathematics at all levels is an idea whose time has
come. To use history in the teaching of undergradu-
ate mathematics, the instructor must be familiar with
the history as well as the mathematics. Vita
Mathematica will enable college teachers to learn the
relevant history of various topics in the undergradu-
ate curriculum and help them incorporate this history
in their teaching.


For example, should calculus be approached from a
geometric or an algebraic point of view? The book
shows us how two important eighteenth century
mathematicians, Colin Maclaurin and Joseph-Louis
Lagrange, understood the calculus from these differ-
ent standpoints and how their legacy is still impor-
tant in teaching calculus today. We also learn why
Lagrange’s algebraic approach dominated teaching in
Germany in the nineteenth century. Some of the rea-
sons for this are related to the appropriate founda-
tions of the calculus, and so the book traces the
ancient history of one of the possible foundations,
the concept of indivisibles. Even though we general-
ly do not use this concept formally today, many ideas
for a heuristic approach to the calculus can be devel-
oped out of this study.


Vita Mathematica contains numerous other articles 
dealing with calculus, with algebra, combinatorics, 
graph theory, and geometry, as well as more general 
articles on teaching courses for prospective teachers. 
This volume, then, demonstrates that the history of 
mathematics is no longer tangential to the mathemat- 
ics curriculum, but in fact deserves a central role. 
Lion Hunting


and Other Mathematical Pursuits 


A Collection of Mathematics, Verse, and Stories 


by Ralph P. Boas, Jr. 


Gerald L. Alexanderson and 
Dale H. Mugler, Editors 


I highly recommend Lion Hunting and Other 
Mathematical Pursuits to high school mathematics 
clubs, mathematics teachers of all levels, and anyone 
interested in mathematics. Perhaps the most impor-
tant feature of this book is how it subtly makes the
reader aware of the nature of mathematics. 


— The Mathematics Teacher 


As a young man at the Institute for Advanced Study in 
Princeton, Ralph Philip Boas, Jr., together with a group of 
other mathematicians, published a light-hearted article on 
the “mathematics of lion hunting” under a pseudonym 
(1938). This sparked a sequence of articles on the topic, 
several of which are drawn together in this book. 


Lion Hunting includes an assortment of articles that show 
the many facets of this remarkable mathematician, editor, 
writer, and teacher. Along with a variety of his lighter 
mathematical papers, the collection includes Boas’ verse 
and short stories, many of which are appearing for the first 
time. Anecdotes and recollections of his numerous experi- 
ences and of his work and meetings with many distin- 
guished mathematicians and scientists of his day are also 
included as well as photographs taken by Boas of Hardy, 
Littlewood, Besicovitch, Weil, and others. 


The mathematical articles in this collection cover a range 
of topics. They include articles on infinite series, the mean 
value theorem, indeterminate forms, complex variables, 
inverse functions, extremal problems for polynomials and 
more. A special section of this book is devoted to articles 
about the teaching of mathematics, with titles such as
“Calculus as an experimental science” and “Can we make
mathematics intelligible?”


Boas’s wit and playful humor are reflected in the verses 
included in this collection. The verses reflect the phases of 
his career as author, editor, teacher, department chair, and 
lover of literature. A section of the book describes the feud 
that Boas supposedly had with Bourbaki. Also included are 
many amusing anecdotes about famous mathematicians. 
Julia: A Life in Mathematics


Constance Reid 


Constance Reid, an established writer about mathemati- 
cians, has written an excellent and loving book, about her 
sister Julia Robinson, the mathematician. The author has 
written that she wants the book to be one for all age 
groups and she has succeeded admirably in making it 
so... Julia wanted to be known as a mathematician, not a
woman mathematician and rightly so! However, she was,
and is, a wonderful role model for women aspiring to be
mathematicians. What a great gift this book would be!

— Alice Schafer, Former President, AWM 


This book is a small treasure, one which I want to share 
with all my mathematical friends. The assembly of sev- 
eral articles and additional photos and remarks provides 
the image of a mathematician of extraordinary taste, 
tenacity and generosity.... Julia Robinson broke ground 
in displaying the deep connections between number the- 
ory and logic. Her results have led to a very active area 
today, making the appearance of this book very timely. 
Her work and her example are however timeless and I 
can think of no better advice to give a young mathe-
matician, either in how to do mathematics, or how to
behave in mathematics, than: “Be like Julia!” 

—Carol Wood, Deputy Director, MSRI 


In high school Julia Bowman stood alone as the 
only girl—and the best student—in her junior and 
senior math classes. She had only one close friend 


and no boyfriends. Although she was to learn (from 
E. T. Bell’s Men of Mathematics) that there are such 
people as mathematicians, her ambition was merely 
to get a job teaching mathematics in high school. 


At great sacrifice her widowed stepmother sent her 
to the University of California at Berkeley to obtain 
the necessary teaching credentials. But at Berkeley, 
in a society of mathematicians, she discovered her- 
self. She was not the duckling that didn’t belong, but 
a swan. There was also a prince at Berkeley, a bril- 
liant young assistant professor named Raphael 
Robinson. Theirs was to be a marriage that would 
endure until her death in 1985. 


Julia is the story of the life of Julia Bowman Robinson, 
the gifted and highly original mathematician who dur- 
ing her lifetime was recognized in ways that no other 
woman mathematician had been recognized up to that 
time. In 1976 she became the first woman mathemati- 
cian elected to the National Academy of Sciences and 
in 1983 the first woman elected president of the 
American Mathematical Society. 


This unusual book, profusely illustrated with previ- 
ously unpublished personal and mathematical memo- 
rabilia, brings together in one volume the prizewin- 
ning “Autobiography of Julia Robinson” by her sister, 
the popular mathematical biographer Constance 
Reid, and three very personal articles about her work 
by outstanding mathematical colleagues. 


All royalties from sales of this book will go to fund a 
Julia Robinson Prize in Mathematics at the high 
school from which she graduated. 
Algebra and Tiling


Homomorphisms in the Service of Geometry 


Sherman Stein and Sándor Szabó


Algebra and Tiling is perfect for bringing alive an 
abstract algebra course. Intuitive but difficult problems 
of geometry are translated into algebraic problems more 
amenable to solution. Full of nice surprises, the book is a
pleasure to read. 

—Choice 


Often questions about tiling space or a polygon lead to 
other questions. For instance, tiling by cubes raises 
questions about finite abelian groups. Tiling by tripods 
or crosses raises questions about cyclic groups. From 
tiling a polygon with similar triangles, it is a short step 
to investigating automorphisms of real or complex 
fields. Tiling by triangles of equal areas soon involves 
Sperner’s lemma from topology and valuations from 
algebra. 


The first six chapters of Algebra and Tiling form a self- 
contained treatment of these topics, beginning with 
Minkowski’s conjecture about lattice tiling of Euclidean 
space by unit cubes, and concluding with Laczkowicz’s 
recent work on tiling by similar triangles. The conclud- 
ing chapter presents a simplified version of Rédei’s theo- 
rem on finite abelian groups: if such a group is factored 
as a direct product of subsets, each containing the identi-
ty element, and each of prime order, then at least one of
them is a subgroup. A remarkable geometric implication 
of this result is developed in Chapter 2. 


Algebra and Tiling is accessible to undergraduate mathe- 
matics majors, as most of the tools necessary to read the 
book are found in standard upper division algebra cours- 
es, but teachers, researchers and professional mathemati- 
cians will find the book equally appealing. Beginners will 
find the exercises and the material found in the appen-
dices especially useful. The “Problems” section will
appeal to both beginners and experts in the field. The
book could serve as the basis of an undergraduate or 
graduate seminar or a source of applications to enrich an 
algebra or geometry course. 


Contents 


Minkowski’s conjecture 

Cubical clusters 

Tiling by the semicross and cross 

Packing and covering by the semicross and cross 
Tiling by triangles of equal areas 

Tiling by similar triangles 

Rédei’s theorem 

Epilog 

Appendices 

References 
CRYPTOLOGY


Albrecht Beutelspacher 


This fascinating little book is eminently readable, and it 
is a great deal of fun to peruse... the book is a real treat. 
We need more books like this, crafted by expert hands yet 
crafted so that the general reader can enjoy them. 
—Bulletin of The Institute of Combinatorics and 
Its Applications 


This excellent and entertaining book is suitable for a first 
course in cryptology for mathematical enthusiasts. An 
abundance of exercises and an excellent list of related ref- 
erences are included. 

—The Mathematics Teacher 


In spite of the light-hearted style in which the book is 
written throughout, it is a serious—-and successful—- 
attempt to explain the basis of coding and decoding mes- 
sages...I can strongly recommend this book to anyone who 
wants a brief but comprehensive, eminently readable, and 
up-to-date introduction to this increasingly popular topic. 
— The Mathematical Gazette 


All of cryptology is covered in this work...Occupying a 
niche in the halls of the ivory tower of pure mathematics 
for nearly two millennia, number theory now forms a pil- 
lar of modern society. This book is the best explanation 
available today of how that pillar was constructed. 

— Charles Aschbacher 


A model to follow in order to make mathematics better 
known and understood. Accessible to a broad audience. 
Have fun reading this book, while you are getting a better 
understanding of cryptology. 

— Bulletin of the Belgian Mathematics Society 


How can messages be transmitted secretly? How 
can one guarantee that the message arrives safely 


in the right hands exactly as it was transmitted? 
Cryptology—the art and science of “secret writ- 
ing”—provides ideal methods to solve these prob- 
lems of data security. 


The book is fun to read, and the author presents 
the material clearly and simply. Many exercises 
and references accompany each chapter. 
American Mathematical Society


Algebra and Geometry: Japanese 
Grade 11 


Kunihiko Kodaira, Gakushuin University, 
Tokyo, Japan, Editor 
Mathematical World; 1996; 174 pages; Softcover; 


ISBN 0-8218-0581-9; List $24; All AMS members $19; 
Order code MAWRLD/10MAA97 


Basic Analysis: Japanese Grade 11 
Kunihiko Kodaira, Editor 
Mathematical World; 1996; 184 pages; Softcover; 


ISBN 0-8218-0580-0; List $24; All AMS members $19; 
Order code MAWRLD/11MAA97 


Mathematics 1: Japanese Grade 10 
Kunihiko Kodaira, Editor 


Mathematical World; 1996; 247 pages; Softcover; 
ISBN 0-8218-0583-5; List $29; All AMS members 
$23; Order code MAWRLD/8MAA97 


Mathematics 2: Japanese Grade 11 


Kunihiko Kodaira, Editor 
Mathematical World; 1996; 262 pages; Softcover; 


ISBN 0-8218-0582-7; List $29; All AMS members $23;
Order code MAWRLD/9MAA97 


You cannot master mathematics by merely reading 
books and memorizing; you should think through the 
material, do calculations, draw figures, and solve prob- 
lems by yourself. You cannot master swimming by 
reading books about swimming; you must swim in the 
water. —from the Foreword 


The achievement of Japanese high school students 
gained world prominence largely as a result of 
their performance in the International 
Mathematics Studies conducted by the 
International Association for the Evaluation of 
Educational Achievement in the 1960s and 1980s. 


These textbooks are intended to give U.S. educa- 
tors and researchers a first-hand look at the con- 
tent of mathematics instruction in Japan. 


Techniques of Problem Solving 


Steven G. Krantz, Washington University, St. 
Louis, MO 


The purpose of this book is to teach the basic 
principles of problem solving, including both 
mathematical and nonmathematical problems. 
This book will help students to ... 


* translate verbal discussions into analytical 
data. 

* learn problem-solving methods for attacking 
collections of analytical questions or data. 

* build a personal arsenal of solutions and inter- 
nalized problem-solving techniques. 

* become “armed problem solvers”, ready to do 
battle with a variety of puzzles in different 
areas of life. 


Taking a direct and practical approach to the sub- 
ject matter, Krantz’s book stands apart from oth- 
ers like it in that it incorporates exercises through- 
out the text. 


1997; 451 pages; Softcover; ISBN 0-8218-0619-X; List 
Ramanujan: Letters
and Commentary 


Bruce C. Berndt, University of 
Illinois, Urbana, and

Robert A. Rankin, University of 
Glasgow, Scotland 


This commendable collection ... is a unique 
contribution to the history of mathematics for at least 
two reasons. It has brought together precious documents 
scattered in many places and provides the reader with 
a wealth of interesting matters related to one of the 
luminaries in the world of mathematics. Second, 
through brief and insightful notes and commentaries, 
the work throws light on many an interesting side 
street connecting to the grand avenue of knowledge on 
which we are riding. With resuscitations of some fading 
photographs and an impressive list of more than 300 
references, this book is a very valuable addition to the 
literature on Ramanujan. —Choice 


The letters that Ramanujan wrote to G. H. Hardy 
on January 16 and February 27, 1913, are two of 
the most famous letters in the history of mathe- 
matics. These and other letters introduced 
Ramanujan and his remarkable theorems to the 
world and stimulated much research, especially 
in the 1920s and 1930s. This book brings together 
many letters to, from, and about Ramanujan. 


Co-published with the London Mathematical Society. 
Members of the LMS may order directly from the 
AMS at the AMS member price. The LMS is registered 
with the Charity Commissioners. 

Customers in India, please contact Affiliated East-West 
Press Private Ltd., 612—A Ornes Road, Kilpauk, 
Madras, 600 010, INDIA; Fax 044-825-7258. 

History of Mathematics; 1995; 347 pages; Hardcover; 
ISBN 0-8218-0287-9; List $59; All AMS members $47; 
Order code HMATH/9MAA97 


A Primer of 
Mathematical Writing 


Steven G. Krantz, Washington University, 
St. Louis, MO 


This book is about writing in the professional 
mathematical environment. While the book is 
nominally about writing, it’s also about how to 
function in the mathematical profession. In many 
ways, this text complements Krantz’s previous 
bestseller, How to Teach Mathematics. Those who 
are familiar with Krantz’s writing will recognize 
his lively, inimitable style. 


In this volume, he addresses these nuts-and-bolts 
issues: 

* Syntax, grammar, structure, and style

* Mathematical exposition

* Use of the computer and TeX

* E-mail etiquette

* All aspects of publishing a journal article


Readers will find in reading this text that Krantz 
has produced a quality work which makes evi- 
dent the power and significance of writing in the 
mathematics profession. 
SPRINGER FOR MATHEMATICS


ALEXANDER J. HAHN, University of Notre Dame 


Learning Basic Calculus 


From Archimedes to Newton to its Role in Science 


Part I: FROM ARCHIMEDES TO NEWTON develops calculus, as well as the necessary trigonometry 
and analytic geometry, from within the relevant historical context, be it that of the Greek thinkers, 
Galileo, Kepler, Descartes, Leibniz, or Newton. 

Part II: CALCULUS AND THE SCIENCES develops the calculus again, this time in a more rigorous 
way. Comparisons with the approaches of Leibniz and Newton point to the necessity of certain 
theoretical concerns. But the primary purpose of this part is the illustration of the fact that calculus 
informs, enlightens, and gives essential substance to, a wide horizon of disciplines of science, 


engineering, and business. 


For much more information visit: http://www.nd.edu:80/~hahn/
1997/APP. 300 PP., 80 ILLUS. /HARDCOVER/$44.50 (TENT.)/ISBN 0-387-94606-3 


TEXTBOOKS IN MATHEMATICAL SCIENCES 


QUANTUM 
The Magazine of Math 
and Science 


Since every issue of Quantum 
Some Probabilistic Aspects
of Set Partitions 


Jim Pitman 


1. INTRODUCTION. A partition of the set $N_n = \{1, 2, \ldots, n\}$ is an unordered
collection of pairwise disjoint non-empty subsets of $N_n$ whose union is $N_n$. Let $P_n$
denote the set of all such partitions, and let $B_n = \#(P_n)$, the number of partitions
of $N_n$. The numbers $B_n$ are known as Bell numbers after E.T. Bell [3, 4, 45]. See
Rota [50] and Gardner [24, Chapter 2] for surveys of their properties and applications.
The remarkable Dobinski formula [18]

$$B_n = e^{-1} \sum_{m=1}^{\infty} \frac{m^n}{m!} \qquad (n = 1, 2, \ldots) \tag{1}$$

leads [36, 1.9] to the asymptotic evaluation

$$B_n \sim \frac{1}{\sqrt{n}}\, \lambda(n)^{n + 1/2}\, e^{\lambda(n) - n - 1} \quad \text{as } n \to \infty, \tag{2}$$

where $\lambda(n) \log(\lambda(n)) = n$. As noted by Comtet [11], for each $n$ the right side of
(1) can be evaluated as the least integer greater than $e^{-1}$ times the sum of the first
$2n$ terms of the series.
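As a quick numerical illustration of formula (1) and of Comtet's observation, the following Python sketch computes Bell numbers with the Bell triangle (a standard construction that is not used in this paper) and compares them with the truncated Dobinski series. The function names are illustrative only, and the floating-point evaluation is meant for small $n$.

```python
from fractions import Fraction
from math import e, factorial, floor


def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] computed with the Bell triangle."""
    row = [1]
    bells = [1]
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells


def dobinski_partial(n, terms):
    """Exact value of sum_{m=1}^{terms} m^n / m! (without the factor 1/e)."""
    return sum(Fraction(m**n, factorial(m)) for m in range(1, terms + 1))


def bell_via_comtet(n):
    """Least integer greater than e^{-1} times the first 2n terms of (1)."""
    partial = float(dobinski_partial(n, 2 * n)) / e
    return floor(partial) + 1


if __name__ == "__main__":
    bells = bell_numbers(8)
    for n in range(1, 9):
        print(n, bells[n], bell_via_comtet(n))
```

The partial sums are kept as exact fractions and converted to floating point only at the last step, which is adequate for the small values of $n$ used here.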

From a probabilistic perspective, the series on the right side of Dobinski’s
formula represents the nth moment of the Poisson distribution with mean 1. So
the initially surprising fact that this series yields an integer for all n amounts to the
fact that all positive integer moments of the Poisson(1) distribution are integers. As
explained in Section 2, Dobinski’s formula reduces to the fact that the factorial
moments of the Poisson(1) distribution are identically equal to 1, and this identity
can be understood probabilistically with essentially no calculation.

While such probabilistic interpretations of identities related to set partitions are 
the main theme of this paper, Section 1.2 recalls an elementary combinatorial 
proof of Dobinski’s formula. 


1.1 Notation. Following the notation of [27], let ${n \brace k}$ denote the number of
partitions of $N_n$ into exactly $k$ distinct non-empty subsets, so that

$$B_n = \sum_{k=1}^{n} {n \brace k}. \tag{3}$$

The ${n \brace k}$ are known as the Stirling numbers of the second kind. Let $m^{\underline{k}}$
denote the falling factorial with $k$ factors

$$m^{\underline{k}} = m(m-1)\cdots(m-k+1), \tag{4}$$

which, for positive integers $m$ and $k$, is the number of permutations of length $k$ of
$m$ distinct symbols. The formula

$$m^n = \sum_{k=1}^{n} {n \brace k}\, m^{\underline{k}} \tag{5}$$

decomposes the number $m^n$ of sequences of $m$ distinct symbols of length $n$ as the
sum over $k$ of the number of such sequences that contain exactly $k$ distinct
symbols [54, p. 35]. As an identity of polynomials in $m$ of degree $n$, this identity
provides an alternative definition of the coefficients ${n \brace k}$ for $1 \le k \le n$. See
[11, 47, 48, 54] for background and a wealth of further information about Stirling
numbers.
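Identities (3) and (5) are easy to check by machine. The sketch below assumes the standard recurrence ${n \brace k} = k{n-1 \brace k} + {n-1 \brace k-1}$, which is not stated in the text above, and the helper names `stirling2` and `falling` are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def falling(m, k):
    """Falling factorial m(m-1)...(m-k+1) with k factors, as in (4)."""
    result = 1
    for i in range(k):
        result *= m - i
    return result


def check_identity_5(m, n):
    """True if m^n equals the sum over k of S(n, k) times the falling factorial."""
    return m**n == sum(stirling2(n, k) * falling(m, k) for k in range(1, n + 1))


if __name__ == "__main__":
    print([stirling2(4, k) for k in range(1, 5)])
    print(all(check_identity_5(m, n) for m in range(1, 7) for n in range(1, 7)))
```

Summing a row of `stirling2` values gives the Bell number of formula (3).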


1.2 A quick proof of Dobinski’s formula. This argument is attributed to Schützenberger
by Foata [22, p. 73]. Divide (5) by $m!$ to obtain for positive integers $m$
and $n$

$$\frac{m^n}{m!} = \sum_{k=1}^{n} {n \brace k} \frac{1(k \le m)}{(m-k)!}, \tag{6}$$

where $1(k \le m)$ equals 1 if $k \le m$ and 0 otherwise. This is the identity of
coefficients of $\lambda^m$ in the power series identity

$$\sum_{m=1}^{\infty} \frac{m^n}{m!} \lambda^m = \left( \sum_{k=1}^{n} {n \brace k} \lambda^k \right) \left( \sum_{j=0}^{\infty} \frac{\lambda^j}{j!} \right), \tag{7}$$

which, upon rearrangement, gives the following horizontal generating function for
the Stirling numbers of the second kind:

$$\sum_{k=1}^{n} {n \brace k} \lambda^k = e^{-\lambda} \sum_{m=1}^{\infty} \frac{m^n}{m!} \lambda^m. \tag{8}$$

Now take $\lambda = 1$ and use (3) to obtain Dobinski’s formula (1). The polynomial
appearing in (8) is known as an exponential polynomial. Many other proofs of the
generalization (8) of Dobinski’s formula are known. See for instance Roman [49,
p. 66] and Wilf [58, p. 106]. Closely related arguments appear in Rota [50], Berge
[5, p. 44], Comtet [11, p. 211], Lupas [37], and Chen-Yeh [10].


2. MOMENTS. For a non-negative integer-valued random variable $X$ with

$$P(X = m) = p_m \qquad (m = 0, 1, \ldots) \tag{9}$$

and a non-negative function $f$, let

$$E[f(X)] = \sum_{m=0}^{\infty} p_m f(m), \tag{10}$$

which is the expected value of $f(X)$ for $X$ with distribution (9). See [20, 43] for
background. From (5) and linearity of the expectation operator $E$, we obtain the
following well-known formula for $E[X^n]$, the nth moment of $X$, in terms of
$E[X^{\underline{k}}]$, the kth factorial moment of $X$ for $1 \le k \le n$ [47, 14]:

$$E[X^n] = \sum_{k=1}^{n} {n \brace k} E[X^{\underline{k}}]. \tag{11}$$

For $\lambda > 0$, let $X_\lambda$ denote a random variable with the Poisson distribution

$$P(X_\lambda = m) = e^{-\lambda} \frac{\lambda^m}{m!} \qquad (m = 0, 1, \ldots) \tag{12}$$

so that

$$E[f(X_\lambda)] = e^{-\lambda} \sum_{m=0}^{\infty} f(m) \frac{\lambda^m}{m!}. \tag{13}$$

Take $f(m) = m^n$ to see that the right side of (8) equals $E[X_\lambda^n]$, so the identity (8)
amounts to the formula

$$E[X_\lambda^n] = \sum_{k=1}^{n} {n \brace k} \lambda^k \qquad (n = 1, 2, \ldots) \tag{14}$$

for the moments of the Poisson($\lambda$) distribution [46, 42]. This formula is the
particular case of (11) for $X$ with Poisson($\lambda$) distribution, for it is known [46, 14]
that

$$E[X_\lambda^{\underline{k}}] = \lambda^k \qquad (k = 1, 2, \ldots). \tag{15}$$

Formula (15) follows easily from (13) with $f(m) = m^{\underline{k}}$ by change of summation
variable from $m$ to $j = m - k$. In particular, for $\lambda = 1$ the factorial moments of
the Poisson(1) distribution are identically equal to 1. So Dobinski’s formula (1) can
be read from (14) for $\lambda = 1$, which follows as indicated above from (11) and (15).
In essence, this is Rota’s [50] proof of Dobinski’s formula cast in probabilistic
notation. This argument differs from the proof in Section 1.2 in that it involves
checking (15) for $\lambda = 1$.
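Formulas (14) and (15) can be checked numerically by truncating the series (13). The sketch below is only a sanity check: the truncation at 80 terms and the names `poisson_expectation`, `stirling2`, and `falling` are assumptions made for illustration.

```python
from math import exp, factorial, isclose


def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def falling(m, k):
    """Falling factorial m(m-1)...(m-k+1)."""
    result = 1
    for i in range(k):
        result *= m - i
    return result


def poisson_expectation(f, lam, terms=80):
    """Truncated version of (13): e^{-lam} * sum_m f(m) lam^m / m!."""
    return exp(-lam) * sum(f(m) * lam**m / factorial(m) for m in range(terms))


def moment_from_14(lam, n):
    """Right side of (14): sum over k of S(n, k) * lam^k."""
    return sum(stirling2(n, k) * lam**k for k in range(1, n + 1))


if __name__ == "__main__":
    lam, n, k = 2.5, 4, 3
    print(poisson_expectation(lambda m: m**n, lam), moment_from_14(lam, n))
    print(poisson_expectation(lambda m: falling(m, k), lam), lam**k)
```

With `lam = 1` the first comparison reproduces the Bell numbers, which is Dobinski's formula in the form discussed above.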

Formula (15) has the following interpretation in terms of a Poisson process
[33, 43]. Let

$$0 < U_1 < \cdots < U_{X_\lambda} < 1 \tag{16}$$

denote the random locations in $(0,1)$ of the points of a homogeneous Poisson
process on $(0,1)$ with mean intensity measure $\lambda\,du$ for $0 < u < 1$. For each
$k = 1, 2, \ldots$ define an associated k-tuple point process, with points in $(0,1)^k$, to
have a point at each of the locations $(U_{\sigma_1}, \ldots, U_{\sigma_k})$ as $\sigma$ ranges over the
$X_\lambda^{\underline{k}}$ different permutations of $\{1, \ldots, X_\lambda\}$ of length $k$. For distinct
$u_i \in (0,1)$, independence properties of the basic Poisson process on $(0,1)$ imply that
the mean intensity of the k-tuple point process at $(u_1, \ldots, u_k) \in (0,1)^k$ is

$$\frac{P(\text{some } U_{\sigma_i} \in du_i \text{ for each } 1 \le i \le k)}{du_1 \cdots du_k} = \frac{(\lambda\,du_1) \cdots (\lambda\,du_k)}{du_1 \cdots du_k} = \lambda^k, \tag{17}$$

so the expected number of points in the k-tuple point process is

$$E[X_\lambda^{\underline{k}}] = \lambda^k \int_0^1 du_1 \cdots \int_0^1 du_k = \lambda^k. \tag{18}$$


Constantine and Savits [12] derive a generalization of Dobinski’s formula by 
consideration of compound nonhomogeneous Poisson processes. See also Stam 
[52] and Di Bucchianico [8] for related results. For various applications of Stirling 
numbers and their generalizations to the computation of moments of probability 
distributions, see [47,9]. Moments of the normal distribution also have interesting 
combinatorial interpretations [19,25]. More generally, the idea of representing 
combinatorially defined numbers by an infinite sum or an integral, typically with a 
probabilistic interpretation, has proved to be a very fruitful one. Other examples 
are the representation of n! as a gamma integral, which leads to Stirling’s formula 
[7, 16,38], and Laplace’s representation of kth differences of powers [35, 14, 30], 
which yields an asymptotic formula for the Stirling numbers of the second kind. 
See [41] for a recent survey of asymptotic enumeration methods. 
3. VARIATIONS OF DOBINSKI’S FORMULA. The derivation of Dobinski’s formula
given in the previous section yields the following proposition:


Proposition 1. Let $X$ be a random variable with values in $\{0, 1, 2, \ldots\}$ and let $n$ be a
positive integer. The following two conditions are equivalent:


(i) the first $n$ factorial moments of $X$ are identically equal to 1;
(ii) the kth moment of $X$ equals $B_k$ for every $1 \le k \le n$.


It is well known that for each $\lambda > 0$ the Poisson($\lambda$) distribution is uniquely
determined by its moments; see for instance [6, Section 30]. The Poisson(1)
distribution is therefore the unique probability distribution whose nth moment
equals $B_n$ for every $n$. But for each fixed $n$ there are many probability distributions
on $\{0, 1, 2, \ldots\}$ that have the same first $n$ moments as Poisson(1). It is obvious
that there can be at most one such distribution of $X$ with $P(X \le n) = 1$, because
the moment conditions amount to a system of $n$ linearly independent equations in
$n$ unknowns $p_1, \ldots, p_n$. Less obvious is the fact that the unique solution of these
equations is such that $p_i \ge 0$ for $1 \le i \le n$ and $\sum_{i=1}^{n} p_i \le 1$, so that
$(p_1, \ldots, p_n)$ is the restriction to $\{1, \ldots, n\}$ of a unique probability distribution on
$\{0, 1, \ldots, n\}$. But this probability distribution on $\{0, 1, 2, \ldots, n\}$, whose first $n$
factorial moments are identically equal to one, is known to arise in the setting of the
classical matching problem [31, 14, 20, 56]. If $M_n$ is the number of fixed points of a
uniformly distributed random permutation of $N_n$, then it is easy to show by the method of
indicators that the first $n$ factorial moments of $M_n$ are identically equal to 1; see
[14]. The distribution of a random variable $X$ with range $\{0, 1, \ldots, n\}$ is recovered
from its factorial moments by the classical sieve formula [14]

$$P(X = m) = \frac{1}{m!} \sum_{k=m}^{n} (-1)^{k-m} \frac{E[X^{\underline{k}}]}{(k-m)!} \qquad (m = 0, 1, \ldots, n). \tag{19}$$

For $X = M_n$ with $E[M_n^{\underline{k}}] = 1$ for $0 \le k \le n$, this simplifies to

$$P(M_n = m) = \frac{1}{m!} \sum_{j=0}^{n-m} \frac{(-1)^j}{j!} \qquad (m = 0, 1, \ldots, n). \tag{20}$$

See [20, Section IV.4] for further discussion. According to Proposition 1, the kth
moment of $M_n$ equals $B_k$ for every $1 \le k \le n$. That is to say,

$$B_k = \sum_{m=1}^{n} \frac{m^k}{m!} \sum_{j=0}^{n-m} \frac{(-1)^j}{j!} \qquad (1 \le k \le n). \tag{21}$$

This variation of Dobinski’s formula is derived in quite a different way by Wilf [58,
p. 22] by substituting the classical formula

$${n \brace k} = \frac{1}{k!} \sum_{m=1}^{k} (-1)^{k-m} \binom{k}{m} m^n \tag{22}$$

into (3). As observed by Wilf, Dobinski’s formula (1) follows easily from (21) by
letting $n \to \infty$. See also Lovász [36, 1.9] for a similar argument, James and Kerber
[32, pp. 227-237] for connections with the representation theory of the symmetric
group, and Diaconis and Shahshahani [17] for various generalizations. In Dale and
Skau [13] the Bell numbers appear as the factorial moments of a probability
distribution on the non-negative integers.
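Proposition 1 and formulas (20) and (21) can be verified exactly for small $n$ with rational arithmetic. The following sketch is illustrative: the function names are not from the paper, and the Bell numbers used for comparison are hard-coded.

```python
from fractions import Fraction
from math import factorial

BELL = [1, 1, 2, 5, 15, 52, 203, 877, 4140]  # B_0, ..., B_8


def matching_pmf(n):
    """P(M_n = m) for m = 0..n, as in formula (20)."""
    return [
        Fraction(1, factorial(m))
        * sum(Fraction((-1) ** j, factorial(j)) for j in range(n - m + 1))
        for m in range(n + 1)
    ]


def falling(m, k):
    """Falling factorial m(m-1)...(m-k+1)."""
    result = 1
    for i in range(k):
        result *= m - i
    return result


def factorial_moment(pmf, k):
    """kth factorial moment of a distribution on {0, ..., len(pmf) - 1}."""
    return sum(falling(m, k) * p for m, p in enumerate(pmf))


def raw_moment(pmf, k):
    """kth ordinary moment of the same distribution."""
    return sum(m**k * p for m, p in enumerate(pmf))


if __name__ == "__main__":
    n = 5
    pmf = matching_pmf(n)
    print([factorial_moment(pmf, k) for k in range(n + 1)])
    print([raw_moment(pmf, k) for k in range(1, n + 1)])
```

For moments of order greater than $n$ the agreement with the Bell numbers stops; for example, the fourth moment of $M_3$ is 14 rather than $B_4 = 15$.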
4. THE MOMENT GENERATING FUNCTION. Consider now the moment generating
function (m.g.f.) of the Poisson($\lambda$) distribution:

$$E[\exp(\theta X_\lambda)] = E\left[ \sum_{n=0}^{\infty} \frac{X_\lambda^n \theta^n}{n!} \right] = \sum_{n=0}^{\infty} E[X_\lambda^n] \frac{\theta^n}{n!}, \tag{23}$$

where the series converge for all real $\theta$ and the interchange of $E$ and $\sum$ is easily
justified. See [6] for a modern treatment of m.g.f.’s. From (13) with $f(m) = e^{\theta m}$
there is the standard formula

$$E[\exp(\theta X_\lambda)] = e^{-\lambda} \sum_{m=0}^{\infty} \frac{(\lambda e^{\theta})^m}{m!} = \exp(\lambda(e^{\theta} - 1)). \tag{24}$$

This combines with (8) to yield the following double generating function of the
Stirling numbers of the second kind. This classical formula [11, p. 206] is an
identity between two different expressions for the m.g.f. in $\theta$ of the Poisson($\lambda$)
distribution:

$$1 + \sum_{n=1}^{\infty} \sum_{k=1}^{n} {n \brace k} \frac{\lambda^k \theta^n}{n!} = \exp(\lambda(e^{\theta} - 1)). \tag{25}$$

In particular, for $\lambda = 1$ this reduces by (3) to Bell’s [3, 4] formula

$$1 + \sum_{n=1}^{\infty} B_n \frac{\theta^n}{n!} = \exp(e^{\theta} - 1), \tag{26}$$

which gives two expressions for the m.g.f. in $\theta$ of the Poisson(1) distribution.
Equating coefficients of $\lambda^k$ in (25) yields the vertical generating function of the
Stirling numbers of the second kind:

$$\sum_{n=k}^{\infty} {n \brace k} \frac{\theta^n}{n!} = \frac{1}{k!} (e^{\theta} - 1)^k \qquad (k = 1, 2, \ldots). \tag{27}$$
See [11, 47,54] for alternative derivations of these identities. There are similar 
identities for many other arrays of combinatorial numbers, such as the binomial 
coefficients and Stirling numbers of the first kind [11,58], [27, p. 351], most of 
which admit probabilistic interpretations. Formulae with binomial coefficients 
typically involve independent trials, while those with Stirling numbers of the first 
kind typically involve the cycle structure of random permutations [1]. See also [2] 
for probabilistic analysis of more general combinatorial structures and further 
references. 
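Bell's formula (26) can be checked by expanding $\exp(e^{\theta} - 1)$ as a power series with exact rational coefficients. The sketch below relies on the standard fact that $g = \exp(f)$ satisfies $g' = f' g$; this is one convenient way to compute the coefficients, not the derivation used in the text, and the function name is illustrative.

```python
from fractions import Fraction
from math import factorial


def egf_coefficients(order):
    """Coefficients g_0..g_order of exp(e^theta - 1) as a power series in theta.

    With f = e^theta - 1 we have f' = e^theta, whose nth coefficient is 1/n!,
    so g' = f' g gives (n + 1) g_{n+1} = sum_{j=0}^{n} g_{n-j} / j!.
    """
    g = [Fraction(1)]
    for n in range(order):
        s = sum(Fraction(1, factorial(j)) * g[n - j] for j in range(n + 1))
        g.append(s / (n + 1))
    return g


def bell_from_egf(order):
    """Bell numbers B_n = n! * g_n read off from formula (26)."""
    return [int(factorial(n) * c) for n, c in enumerate(egf_coefficients(order))]


if __name__ == "__main__":
    print(bell_from_egf(8))
```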


5. RANDOM PARTITIONS. A random partition of $N_n$ is a random variable $\Pi$
with values in the set $P_n$ of partitions of $N_n$. The distribution of $\Pi$ then refers to
the collection of probabilities $P(\Pi = \pi)$ as $\pi$ ranges over $P_n$. Questions about
enumeration of partitions of $N_n$ of various kinds can be phrased probabilistically in
terms of a uniform random partition, that is, a random partition $\Pi$ with the
uniform distribution $P(\Pi = \pi) = 1/B_n$ for each partition $\pi \in P_n$. For developments
of this idea see [29, 28, 51, 23]. Random partitions with non-uniform distribution
also arise naturally in various contexts, so it is useful to have models for
random partitions, both uniform and non-uniform.

The following random allocation scheme provides a basic method of generating
a random partition of $N_n$. See [14, 34, 57] for extensive study of this and related
schemes, and further references. Throw $n$ balls labelled by $N_n$ into $m$ boxes
labelled by $N_m$, and assume that all $m^n$ possible allocations of balls into boxes are
equally likely. Let $\Pi_{n,m}$ be the partition of balls by boxes. More formally, let $X_i$ be
the number of the box containing the ith ball for $1 \le i \le n$. Then the $X_i$ are
independent and uniformly distributed on $N_m$, and $\Pi_{n,m}$ is the partition of $N_n$
induced by the random equivalence relation $i \sim j$ if and only if $X_i = X_j$. Formally,
the $X_i$ can be regarded as coordinate maps defined on $(N_m)^n$, and $\Pi_{n,m}$ is then
defined as a map from $(N_m)^n$ to $P_n$, the set of partitions of $N_n$. Let $\#(\pi)$ denote
the number of subsets in a partition $\pi \in P_n$. The distribution of $\Pi_{n,m}$ induced by
the uniform distribution $P$ on $(N_m)^n$ can be read from formula (5):

$$P(\Pi_{n,m} = \pi) = \frac{m^{\underline{k}}}{m^n} \quad \text{if } \#(\pi) = k. \tag{28}$$

The distribution of $\#(\Pi_{n,m})$, the number of occupied boxes when $n$ balls are
thrown into $m$ boxes, is given by the following probabilistic equivalent of (5):

$$P[\#(\Pi_{n,m}) = k] = {n \brace k} \frac{m^{\underline{k}}}{m^n} \qquad (1 \le k \le n). \tag{29}$$

Because the probability displayed in (28) depends on the number of occupied
boxes $k$, for $n \ge 3$ this random partition $\Pi_{n,m}$ of $N_n$ does not have uniform
distribution on $P_n$ for any $m$. However, as observed by Stam [53], for each fixed $n$
it is possible to generate a uniformly distributed random element of $P_n$ by a
suitable randomization of $m$. The following proposition was suggested by Stam’s
construction, which is described in Corollary 3.
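Formula (28) can be confirmed for small cases by enumerating all $m^n$ allocations and tabulating the induced partitions. The following brute-force sketch is for illustration only; the function names are assumptions, and partitions are represented as frozensets of frozensets.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def falling(m, k):
    """Falling factorial m(m-1)...(m-k+1)."""
    result = 1
    for i in range(k):
        result *= m - i
    return result


def induced_partition(allocation):
    """Partition of balls 1..n induced by a tuple of box numbers."""
    blocks = {}
    for ball, box in enumerate(allocation, start=1):
        blocks.setdefault(box, []).append(ball)
    return frozenset(frozenset(b) for b in blocks.values())


def partition_distribution(n, m):
    """Exact distribution of the partition when n balls go into m boxes."""
    counts = Counter(
        induced_partition(a) for a in product(range(1, m + 1), repeat=n)
    )
    total = m**n
    return {pi: Fraction(c, total) for pi, c in counts.items()}


if __name__ == "__main__":
    n, m = 3, 4
    for pi, prob in partition_distribution(n, m).items():
        k = len(pi)  # number of occupied boxes
        print(sorted(sorted(b) for b in pi), prob, Fraction(falling(m, k), m**n))
```

The printed probabilities differ between partitions with different numbers of blocks, which is the non-uniformity noted above for $n \ge 3$.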


Proposition 2. Let $M$ be a random variable with values in $\{1, 2, \ldots\}$, and suppose
given $M = m$ that $n$ balls labelled by $N_n$ are thrown independently and uniformly at
random into $m$ boxes. Let $\Pi_{n,M}$ denote the random partition of $N_n$ so generated. The
following two conditions are equivalent:


(i) $\Pi_{n,M}$ has uniform distribution over the set $P_n$ of all partitions of $N_n$;

(ii) the distribution of $M$ is of the form

$$P(M = m) = \frac{m^n p_m}{B_n} \qquad (m = 1, 2, \ldots) \tag{30}$$

for some probability distribution $(p_m)$ on $\{0, 1, 2, \ldots\}$ whose first $n$ factorial
moments are identically equal to 1.


Before the proof, here are two corollaries which follow immediately from the 
Proposition and the discussion in Sections 2 and 3: 


Corollary 3. [53] If $M$ has the distribution (30) for $p_m = e^{-1}/m!$, then $\Pi_{n,M}$ has
uniform distribution on $P_n$.


Corollary 4. For each $n$ there is a unique distribution of $M$ such that
$P(M \le n) = 1$ and $\Pi_{n,M}$ has uniform distribution on $P_n$.
This distribution of $M$ is defined by (30) for $p_m = P(M_n = m)$ as in (20) with $M_n$ the
number of fixed points of a uniform random permutation of $N_n$.
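Corollary 4 lends itself to an exact check, since the mixing distribution has finite support. The sketch below builds the distribution (30) from the matching probabilities (20), verifies that every partition with $k$ blocks receives probability $1/B_n$, and then uses that distribution to sample a partition. The function names and the use of Python's `random` module are illustrative assumptions, not part of the paper.

```python
import random
from fractions import Fraction
from math import factorial


def matching_pmf(n):
    """P(M_n = m) for m = 0..n, as in formula (20)."""
    return [
        Fraction(1, factorial(m))
        * sum(Fraction((-1) ** j, factorial(j)) for j in range(n - m + 1))
        for m in range(n + 1)
    ]


def falling(m, k):
    """Falling factorial m(m-1)...(m-k+1)."""
    result = 1
    for i in range(k):
        result *= m - i
    return result


def mixing_distribution(n):
    """Return ([P(M = m) for m = 1..n], B_n) following (30) with p_m from (20)."""
    p = matching_pmf(n)
    weights = [m**n * p[m] for m in range(1, n + 1)]
    bell_n = sum(weights)  # equals B_n by formula (21) with k = n
    return [w / bell_n for w in weights], bell_n


def partition_probability(n, k):
    """Probability of any fixed partition with k blocks, by conditioning on M."""
    dist, _ = mixing_distribution(n)
    return sum(
        q * Fraction(falling(m, k), m**n) for m, q in enumerate(dist, start=1)
    )


def sample_uniform_partition(n, rng=random):
    """Draw M from the mixing distribution, then throw n balls into M boxes."""
    dist, _ = mixing_distribution(n)
    m = rng.choices(range(1, n + 1), weights=[float(q) for q in dist])[0]
    boxes = {}
    for ball in range(1, n + 1):
        boxes.setdefault(rng.randrange(m), []).append(ball)
    return frozenset(frozenset(b) for b in boxes.values())


if __name__ == "__main__":
    n = 5
    dist, bell_n = mixing_distribution(n)
    print(bell_n, [partition_probability(n, k) for k in range(1, n + 1)])
    print(sorted(sorted(b) for b in sample_uniform_partition(n)))
```

The sampler converts the exact weights to floating point, so it is uniform only up to rounding error in those weights.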




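Proof of Proposition 2. By conditioning on $M$ and using (28),

$$P(\Pi_{n,M} = \pi) = \sum_{m=1}^{\infty} \frac{m^{\underline{k}}}{m^n} P(M = m) \quad \text{if } \#(\pi) = k, \tag{31}$$

so the distribution of $\Pi_{n,M}$ is uniform on $P_n$ if and only if

$$\sum_{m=1}^{\infty} \frac{m^{\underline{k}}}{m^n} P(M = m) = \frac{1}{B_n} \qquad (1 \le k \le n). \tag{32}$$

Define

$$p_m = B_n m^{-n} P(M = m) \qquad (m = 1, 2, \ldots), \tag{33}$$

so that (32) becomes

$$\sum_{m=1}^{\infty} m^{\underline{k}} p_m = 1 \qquad (1 \le k \le n), \tag{34}$$

which for $k = 1$ implies that $\sum_{m=1}^{\infty} p_m \le \sum_{m=1}^{\infty} m p_m = 1$. It follows that
$\Pi_{n,M}$ is uniform if and only if $(p_m)$ derived from the distribution of $M$ via (33) is the
restriction to $\{1, 2, \ldots\}$ of a probability distribution on $\{0, 1, 2, \ldots\}$ whose first $n$
factorial moments are equal to 1. This is condition (ii). □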

As shown by Stam, Corollary 3 allows numerous results regarding the asymptotic
distribution for large $n$ of a uniform random partition of $N_n$ to be deduced
from corresponding results for the classical occupancy problem defined by random
allocations of balls in boxes, for which see [34, 57]. See also [2, 15, 23, 26, 28, 29, 51]
for a more detailed account of the asymptotics of uniform random partitions of $N_n$.

As a variation, the following corollary is easily obtained by a similar argument: 


Corollary 5. Suppose that $M$ has the distribution

$$P(M = m) = \frac{m^n P(X_\lambda = m)}{\mu_n(\lambda)} \qquad (m = 1, 2, \ldots), \tag{35}$$

where $X_\lambda$ has the Poisson($\lambda$) distribution (12), and $\mu_n(\lambda) = E(X_\lambda^n)$. Then the
distribution of $\Pi_{n,M}$ is given by

$$P(\Pi_{n,M} = \pi) = \frac{\lambda^k}{\mu_n(\lambda)} \quad \text{if } \#(\pi) = k. \tag{36}$$


As a check, (36) implies

$$P[\#(\Pi_{n,M}) = k] = {n \brace k} \frac{\lambda^k}{\mu_n(\lambda)} \qquad (1 \le k \le n). \tag{37}$$

The fact that these probabilities sum to 1 amounts to formula (14) for $\mu_n(\lambda)$. The
distribution of $\Pi_{n,M}$ defined by formula (36) defines a Gibbs distribution on
partitions of $N_n$. See [55, 44] for further discussion of such Gibbs distributions on
sets of combinatorial objects. See Nijenhuis and Wilf [39] for a recursive algorithm
to construct a uniform random partition of $N_n$, based on the recurrence

$$B_n = 1 + \sum_{k=1}^{n-1} \binom{n-1}{k-1} B_{n-k}, \tag{38}$$

where the right side counts the number of partitions $\pi$ of $N_n$ according to the size
$k$ of the subset in $\pi$ that contains $n$ [36, Problem 1.10]. See [40] for related
combinatorial algorithms, and [21] for a recent systematic approach to the random
generation of labelled combinatorial structures and further references on this
topic.
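The recurrence (38) suggests a simple recursive sampler: choose the size $k$ of the block containing the largest remaining element with probability proportional to $\binom{s-1}{k-1} B_{s-k}$, where $s$ is the number of remaining elements, fill that block with uniformly chosen companions, and repeat on what is left. The Python sketch below follows this idea; it is an illustration in the spirit of the recursive approach mentioned above, not a transcription of the algorithm in [39], and all names are hypothetical.

```python
import random
from math import comb


def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] using recurrence (38), with B_0 = 1."""
    bells = [1]
    for n in range(1, n_max + 1):
        bells.append(
            1 + sum(comb(n - 1, k - 1) * bells[n - k] for k in range(1, n))
        )
    return bells


def random_partition(n, rng=random):
    """Sample a partition of {1, ..., n} uniformly at random."""
    bells = bell_numbers(n)
    elements = list(range(1, n + 1))
    blocks = []
    while elements:
        s = len(elements)
        largest = elements.pop()  # list stays sorted, so this is the maximum
        sizes = list(range(1, s + 1))
        # Weight of block size k: choose k-1 companions, partition the rest.
        weights = [comb(s - 1, k - 1) * bells[s - k] for k in sizes]
        k = rng.choices(sizes, weights=weights)[0]
        companions = set(rng.sample(elements, k - 1))
        blocks.append(frozenset(companions | {largest}))
        elements = [x for x in elements if x not in companions]
    return frozenset(blocks)


if __name__ == "__main__":
    print(bell_numbers(8))
    print(sorted(sorted(b) for b in random_partition(6)))
```

Each partition is returned as a frozenset of frozensets, so samples can be counted directly in a dictionary to compare empirical frequencies with $1/B_n$.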
Building an International Reputation:
The Case of J. J. Sylvester (1814-1897) 


Karen Hunger Parshall and Eugene Seneta 


James Joseph Sylvester—prolific, gifted, flamboyant, egocentric, cantankerous. At 
the time of his death in London on 15 March, 1897, Sylvester’s reputation 
internationally as one of the nineteenth century’s principal mathematical figures 
had long been secure. He had worked hard to assure this. Obviously, he had done 
much seminal work in building the theory of invariants, and this had contributed to 
his renown. Yet, Sylvester had felt compelled to establish ties directly with 
mathematicians at home—but more importantly abroad—in order to make his 
name known. Was this just a matter of egocentrism, or did other factors contribute 
to his international focus? What did it take to become an internationally recog- 
nized British mathematician in the latter half of the nineteenth century, when first 
France and then Germany very much set the mathematical standard? Why was this 
even important? As the centenary of Sylvester’s death brings historians and 
mathematicians together in England to celebrate his life and research, we examine 
some of the reasons why Sylvester valued the international mathematical arena so 
highly and how he used it to his advantage during the course of his career. 

It is well-known that Sylvester, as a Jew, was, like all non-Anglicans, debarred 
by the Test Acts from taking the Cambridge degree he had earned as Second 
Wrangler in 1837 and from holding a Cambridge fellowship or professorship. His 
first position, the professorship of natural philosophy at nonsectarian University 
College London, was too far from his real interests and expertise to satisfy him, so 
he gave it up in 1841 after only three years for the uncertain fortunes of a 
professorship of mathematics in exile far from home (and, he quickly came to 
think, far from civilization!) at the University of Virginia. He lasted there for 
four-and-a-half months before resigning over a matter of principle and fleeing 
northward to New York City and his brother’s home. From there, he tried in vain 
for some eighteen months to secure a new position in the United States—at 
Columbia College, at Harvard with Benjamin Peirce, in the Washington, D.C. 
area, and even at the University of South Carolina—before returning to England 
to resume what he termed the “fruitless and hopeless struggle with an adverce [sic] 
tide of affairs” [39]. By the close of 1844, though, he proudly reported having 
“recovered [his] footing in the world’s slippery path” [39] thanks to his assumption 
of the post of actuary and secretary at the Equity and Law Life Assurance 
Company in London. During the decade from 1845 to 1855, he prepared for and 
passed the Bar; he met his mathematical alter ego, Arthur Cayley; and he 
produced his ground-breaking series of papers in what would come to be known as 
invariant theory. The next fifteen years found him in his first sustained academic 
post, the professorship of mathematics at the Royal Military Academy in Wool- 
wich, where he taught drudgerous mathematics to mostly uncaring students and 
fought with the military authorities over teaching loads destined, he was convinced, 
to bring the “extinction of my scientific existence” [40]. Sylvester’s career trajectory
clearly diverged from those guided within the ivy-covered walls of the colleges of
Cambridge and Oxford.


James Joseph Sylvester, aged 26, by George Patton. In the private collection of Mr. and Mrs. Alain
Enthoven.

As an establishment-outsider, who, unlike Augustus DeMorgan, did not even 
hold a position at the leading anti-establishment institution, how was Sylvester to 
secure the reputation that his healthy-sized ego demanded and that his manifest 
mathematical talents warranted? Surely, the British Association for the Advance- 
ment of Science or the Royal Society, to which he had been elected in 1839, 
represented avenues toward the establishment of national recognition, and 
Sylvester presented his work, such as accounts of his research on Sturm’s theorem for
locating the roots of an algebraic equation ([1, 1:59-60] and [1, 1:429-586]), before
both of these organizations throughout his career. In Sylvester’s Platonistic
view, however, mathematics was a universal endeavor that transcended national 
boundaries (see [24] for an analysis). Moreover, as Sylvester well realized, the 
Continent—and particularly France and Germany—dominated mathematical re- 
search at mid-century, while England tended to assume a more isolationist pos- 
ture. It was thus important to make one’s work known abroad. This would help 
assure that credit was given where due, that results published in British journals 
were not ignored or overlooked outside England, and that British mathematicians 
effectively contributed to building the eternal edifice of mathematics. As Marin 
Mersenne had shown in the seventeenth century, establishing an international 
network of correspondents could be remarkably helpful in achieving these goals. 

Sylvester actively sought to forge his own international mathematical connec- 
tions beginning around 1850. Initially, at least, he seemed to have the greatest 
number of ties with France, a country which had dominated mathematics and the 
sciences during the first half of the nineteenth century [14], a country with whose 
language Sylvester felt at ease, a country with an influential scientific society and 
mathematicians of the highest repute. In France, Sylvester enjoyed perhaps his 
most lasting and most intimate mathematical association with Charles Hermite. 

Eight years Sylvester’s junior, Hermite had come upon the mathematical scene 
in the 1840s with work on elliptic and hyperelliptic functions that had earned 
Jacobi’s admiration. In 1848, the year after taking his baccalauréat and licence, he 
was named répétiteur and admissions examiner at the Ecole polytechnique, where 
he himself had pursued his studies. Rather quickly, he established a reputation as 
one of the rising stars in French mathematics with his work on quadratic forms 
and, beginning in 1854, on the theory of invariants. His election to the Paris 
Académie des Sciences in 1856 gives a clear indication of his stature in the French 
scientific community. Since Hermite’s research interests fundamentally overlapped 
those of Sylvester, it is little wonder that the Englishman sought out the kindred 
French mathematical spirit. In particular, Sylvester wanted to make sure that 
Hermite knew of and gave the proper credit to his research. 


Charles Hermite, aged 25, engraved by Ch. Wittmann from Oeuvres de Charles Hermite, ed. Emile 
Picard, 3 vols. (Paris: Gauthier-Villars, 1905-1912).

After finishing his law studies in 1850 and following the establishment of his
close personal friendship and mathematical exchange with Arthur Cayley, Sylvester 
finally began to come into his own as a researcher. Between 1850 and 1854, he 
published much of his work on determinants as well as some of his seminal papers 
on the emergent theory of invariants. The year 1852, in particular, witnessed the 
publication of his first major contribution to invariant theory, his paper, “On the
Principle of the Calculus of Forms” [1, 1:284-327 and 328-363]. Sylvester earnestly
desired that this work not escape the immediate notice of the French mathematical
community, and, to this end, he wrote several letters in 1852 to his correspondent,
Irénée-Jules Bienaymé (compare [20]), asking him to distribute offprints to
certain key individuals in addition to the Société philomatique de Paris, to which 
Sylvester had been elected as a corresponding member early that year, and to the 
Institut de France. 

According to Bienaymé’s handwritten tally [2] in response to Sylvester’s first 
request on 7 February, 1852 [31], Michel Chasles, Charles Hermite, Olry Terquem, 
Eugène Catalan, Joseph Serret, and Joseph Bertrand were to receive copies of the
first installment of Sylvester’s latest pronouncement on the calculus of forms [1,
1:284-327] as well as an 1851 paper, “On the General Theory of Associated
Forms” [1, 1:198-202], while Cauchy and the Institut de France were to get these
together with a paper on the theory of determinants [1, 1:241-250]. Bienaymé
dutifully delivered these and received additional requests from Sylvester on 4 June 
[32] and again on 27 August [33] to deliver more papers to these and selected 
others. Sylvester’s letter of 4 June revealed at least one additional motivation for 
this general distribution of his work. There, he asked Bienaymé to pass copies of 
the second installment of his paper on the calculus of forms to Hermite and to the 
Société philomatique as before, but this time he also wanted a copy to go to the 
editor of the Journal des mathématiques pures et appliquées, Joseph Liouville. As he 
explained to Bienaymé, 


I wish M. Liouville to have a copy because I am told that M. Eisenstein of 
Berlin has sent to Liouville’s Journal the same kind of matter as is in my 
Section VI [32]. 


This was the demonstration of the sufficiency of two particular differential opera- 
tors that Cayley and Sylvester employed for detecting invariants [1, 1:351-360], and 
Sylvester clearly wanted Liouville to know that he and Cayley—and not Eisenstein 
—had discovered the properties of these operators first. Eisenstein did not end up 
publishing the contested proof [11]. 

As Sylvester’s delivery list to Bienaymé also indicates, the Englishman’s associa- 
tion with Michel Chasles dates from at least the early 1850s. Chasles, who had 
established his reputation as both a geometer and an historian of mathematics as 
early as 1837 with the publication of his Aperçu historique sur l’origine et le
développement des méthodes en géométrie, had been named to the chair of higher 
geometry at the Sorbonne in 1846 and had become a full member of the Paris 
Académie des Sciences in 1851. He was thus another influential member of the 
French mathematical community and a worthy contact for Sylvester to establish, 
even though his brand of geometric research was never Sylvester’s forte. In 1852, as 
Sylvester was busy with the early invariant-theoretic work that he was anxious for 
Bienaymé to deliver to Chasles, Chasles had just published a new book, his Traité 
de géométrie supérieure [8], and asked Sylvester to serve as the messenger for two
copies to British colleagues [7]. In his letter of 26 August, Chasles also provided
Sylvester with an indication that he was starting to make a name for himself in
France.

Irénée-Jules Bienaymé (1796-1878) by Jules Franceschi (1825-1893). Photo by, and courtesy of,
Arnaud Bienaymé.

Chasles reported that 


I saw M. Terquem yesterday; he spoke to me with pleasure and enthusiasm 
of the beautiful theorem that you sent him, and knowing that I was going to 
write you, he asked me to give you his compliments and to tell you that he 
was going to publish your communication without delay in his journal [7, our 
translation]. 


Olry Terquem was the editor of the Nouvelles annales de mathématiques, and the 
submission in question was Sylvester’s first publication in a foreign journal, his 
determinant-theoretic proof of the fact that if $A$ is an $n \times n$ symmetric matrix,
then the roots of the characteristic equation of $A^p$, for $p$ an integer, are the
roots of the characteristic equation of $A$, each raised to the $p$th power
[1, 1:364-366]. Sylvester quickly followed this with two more notes to Terquem’s
journal ([1, 1:423] and [1, 1:424-428]). Having successfully taken the step of
bringing his work directly before the Continental audience by publishing in a 
foreign journal, Sylvester continued throughout his career to submit his work to 
periodicals both at home and abroad.

Besides Hermite, Bienaymé, and Chasles, Sylvester also established ties in the
1850s in France with Jean-Victor Poncelet, Jean-Marie-Constant Duhamel, Joseph 
Serret, and Joseph Bertrand. Relative to Germany, however, his contacts were 
initially fewer and his relations never as close. One exception to this was Carl 
Borchardt, who had spent the year from 1846 to 1847 studying in Paris and forging 
his own links with Chasles, Hermite, and Liouville. Borchardt had earned his 
doctorate under Jacobi at the University of Königsberg in 1843 and had made a
splash in 1846 with his first mathematical publication, a work right up Sylvester’s 
mathematical alley. Borchardt had given a determinant-theoretic argument show- 
ing that the Sturm functions that arise from the equation determining the secular 
disturbances of the planets can be represented as the sum of squares [4]. In 
subsequent work, he considered related algebraic questions involving symmetric 
functions and elimination theory that also actively engaged Sylvester. Given their
common interests, it is little wonder that Sylvester and Borchardt began to
correspond. It is also little wonder that, in 1852, Sylvester wanted to apprise
Borchardt of his latest researches. 

On 20 February, just two weeks after he had entrusted Bienaymé with the 
bundle of offprints for distribution in Paris, Sylvester wrote Borchardt enclosing a 
copy of one of the same papers, the first installment of his paper “On the Principle 
of the Calculus of Forms.” Burdened by his ongoing preparation of Jacobi’s 
collected works and by his teaching duties as Privatdozent at the University of 
Berlin, Borchardt only responded to Sylvester’s letter on 6 April [3]. There, he 
apologized for the delay in his response and for not having had a chance to read 
and study Sylvester’s latest achievement. More interestingly, he also returned to a 
topic that had apparently been under discussion in earlier letters between the two 
men: the integrity of Otto Hesse. 

Because the early 1850s were abundantly productive years for Sylvester mathe- 
matically, it was then that he actively sought to build and solidify his increasingly 
deserved reputation. Always touchy about matters of priority, but perhaps most 
touchy during these early years, Sylvester came to feel that Hesse [18] had stolen a 
result he had published in the Philosophical Magazine in 1841 [1, 1:75-85]. 
Sylvester, furious, felt that Hesse had consciously failed to credit his work and 
lambasted him in print in [1, 1:184-197 on p. 189]. In his letter, however, 
Borchardt offered a very different read on the situation. In his measured and more
objective view, Borchardt wrote that


[i]f Mr. Hesse had known of your memoir ..., he would not have committed
plagiarism, and I know him too well to believe him capable of it. This 
changes nothing relative to your priority but much relative to the moral 
judgment on Mr. Hesse. In mathematics it often happens that, owing to an 
insufficient knowledge of the literature, results are published as new which 
have already been obtained earlier by others. Such oversights must certainly 
be corrected, but if in every such case of this kind one claimed plagiarism, 
one would not be justified [3, our translation]. 


This dispassionate assessment of the situation as well as Sylvester’s heated reaction 
to what he deemed to be Hesse’s initial slight reflect a key aspect of reputation- 
building, the paramount importance of priority. Without it, someone else’s reputa- 
tion may grow incrementally at the expense of one’s own hard work and effort. In 
Sylvester’s case, this became even more of an issue when publications serving 
primarily different national constituencies were involved. If Hesse read the
Philosophical Magazine, a periodical that published, by and large, mathematical work of
lesser quality than its German counterparts, then he had overlooked Sylvester’s 
paper. If he—and by extension his countrymen—did not read the Philosophical 
Magazine, then all the worse for those British mathematicians trying conscien- 
tiously to make their work known to the broader community of mathematicians 
through that means. At least in this case, the priority dispute was ultimately little 
more than a tempest in a teapot; just two years later, at a crucial juncture in 
Sylvester’s career, Hesse served as one of the Englishman’s hand-picked refer- 
ences. 

By 1854, Sylvester had been in his job as actuary and secretary at Equity Law 
and Life for some nine years. He had also been doing some of his best mathemati- 
cal research and, understandably, wanted a position more consonant with his 
interests and training. When the professorship of mathematics at the Royal 
Military Academy in Woolwich came open, then, he quite naturally took the 
opportunity to apply. When he lost out on the appointment to a mathematically 
inferior candidate and felt that the military authorities had misrepresented some 
of the facts surrounding the election, he wrote to another longtime acquaintance, 
the former Lord Chancellor, Lord Brougham, expressing his frustration. Not 
without a certain amount of pride, Sylvester let Brougham know that 


Letters were written or sent in in support of my application and couched 
in the strongest language in which a recommendation could be clothed from 


Sir William Hamilton, Dublin 
Professor Graves, D[itt]o 
Professor Kelland, Edinburgh 
Professor Challis, Cambridge 
The Bishop of Natal 

General Poncelet, Paris 

M. Chasles, Paris 

Rev[eren]d Geo Salmon, Dublin 
Duhamel, Paris 

Serret 

Hermite of the Examiners at the Polytechnique 
Bertrand 


and many others. 
Letters were also written but too late to be sent in by 


the great Lejeune Dirichlet, Berlin 
Professors Peters & Hesse, Konigsberg 
Professor Joachimsthal, Halle 
Professor Thomson, Glasgow 


and also from distinguished pupils testifying to my teaching powers [34]. 


As is evident from this roster, Sylvester made important use of the international 
network he had established, in his attempt to break into the English academic 
world. This first try proved unsuccessful, but, when the victorious candidate died 
unexpectedly after only a few months on the job, Sylvester reapplied and won the 
appointment (see [28] on the changing London mathematical scene). His distin- 
guished list of referees may not have helped him get the post, but it apparently did 
not hurt. Sylvester had finally broken back into academe after a dozen years away.

The late 1850s and early 1860s found Sylvester hard at work on his ongoing
researches in invariant theory and in new, but not unrelated, work in combina- 
torics. In a series of seven lectures delivered at King’s College, London between 6 
June and 11 July, 1859 (but published only in 1897), he brought together a large 
number of known and new combinatorial results in an early effort to systematize a 
theory of partitions [1, 2:119-175]. Prior to delivering the last of these lectures, 
Sylvester wrote to his friend, Pafnuty Chebyshev, thanking him for the elucidating 
remarks on some of Euler’s work on partitions that enabled him to include a 
discussion of it in his presentation [38]. Sylvester and Chebyshev had known each 
other at least since the fall of 1852, when Chebyshev was on a foreign tour to study 
the scientific and technological advances in evidence in Paris, London, and Berlin. 
After seeing Bienaymé and others in Paris, he went on to visit Sylvester in London, 
and the two remained in touch sporadically for many years thereafter. In fact, like 
Sylvester, Chebyshev also maintained his contact with Hermite. As the Russian 
mathematician would later recollect, 


[w]e were once sitting in Paris, the three of us: Hermite, Sylvester, and I. 
Hermite—the leading mathematician of France, Sylvester—the leading math- 
ematician of England, and I [15, our translation]. 


Sylvester’s growing international network also extended to Italy by the early 
1860s to include Enrico Betti, fellow determinant-theorist Francesco Brioschi, and 
physicist and later Minister of Education, Carlo Matteucci, among others. In the 
winter of 1862, in fact, Sylvester followed one of his many trips to Paris with an
extended scientific tour of the newly unified Italy and described at least part of his 
itinerary in a letter to Cayley [36]. Here, Sylvester entered into a different role 
relative to his international network; his interest—as well as that of other foreign 
mathematicians—in the post-unification Italian scene helped to legitimize the 
research efforts of the Italians in the latter half of the nineteenth century ([5] and
[22]). Sylvester’s presence helped the Italians establish their reputations; by the 
early 1860s, his reputation, at least in their eyes, was already secure. This was 
apparently true elsewhere as well. 

Late in 1863, Sylvester received what must have been the most significant 
symbol of his international stature as a mathematician, his election as foreign 
correspondent to the Paris Académie des Sciences. It was his contact, Joseph 
Bertrand, whom Sylvester chose to communicate one of his first ‘official’ works to
the assembled savants, and that communication related to his (the first) proof 
of Newton’s rule for isolating pairs of complex roots of polynomial equations 
[1, 2:361-362]. As Sylvester explained to Bertrand,


I have proved, not without some difficulty, Newton’s rule to the fifth degree 
inclusive. Messrs. De Morgan and Cayley have expressed to me their firm
conviction that Newton himself never had a proof of this rule, which remains 
to this day the marvel and the disgrace of Algebra [30, our translation]. 


Interestingly, Sylvester announced the result in print first in France, but only after 
having read the paper before the Royal Society; he later published the full 
exposition of it in the Philosophical Transactions [1, 2:376-479]. In so doing, he
was, in a sense, maximizing the publicity for the new result; he gave it to the most
important scientific bodies in both his native England and in France. Moreover, 
after the founding of the London Mathematical Society in 1865, he took yet
another opportunity to highlight his work on the rule by lecturing at one of the
Society’s early meetings on the proof he had subsequently discovered of the 
general case [1, 2:498-513]. 

This mathematical triumph was followed in the late 1860s through the mid 
1870s by a troubled period which found Sylvester first casting about for mathemati- 
cal direction, next in premature retirement from his post at the Military Academy, 
and then adrift and unemployed in London. All of this changed in 1876, when he 
accepted the first professorship of mathematics at The Johns Hopkins University 
in Baltimore, Maryland. He had made a bold transatlantic move as a young man of 
twenty-eight; in 1876, at the age of sixty-one, he did it again. The second time 
around the outcome was much more positive. 

Sylvester’s arrival on American shores marked the beginning of a quarter-cen- 
tury-long process of establishing mathematics at the research level in the United 
States [27]. By 1877, Sylvester had regained his research footing and had reengaged 
in his earlier invariant-theoretic researches. In particular, he sought to vindicate 
the British approach to invariant theory that he and Cayley had developed by 
providing a British-style proof of Gordan’s finiteness theorem of 1868 (on the 
history of this problem, see [9], [10], and [26]). Despite Cayley’s supposed proof in 
1856 that the number of irreducible covariants of a binary quintic form is infinite 
[6], Paul Gordan showed that this number is, in fact, finite for any given binary 
form [13]. Beginning in the late 1870s, Sylvester worked on and off to supply a 
British proof of Gordan’s theorem. Writing to William Spottiswoode on 19 Novem- 
ber, 1876, he made his nationalistic and personal motivations crystal clear in 
announcing what he thought was a proof of Gordan’s theorem. “The piratical 
Germans Clebsch and Gordan who have so unscrupulously done their best to rob 
us English of all the credit belonging to the discoveries made in the New Algebra 
will now suffer it is to be hoped the due Nemesis of their misdeeds,” Sylvester 
declared. 


Nothing in Clebsch and Gordan is really new but their Cumbrous method of 
limiting (not determining) the Invariants to any given form.... I see a 
splendid vista of investigations open to me on this subject destined I believe 
to reduce to annihilation all that the school of Clebsch and Gordan, by aid of 
methods borrowed by the Germans without acknowledgement from Cayley 
and myself, have attempted on the subject [42; Sylvester’s emphasis]. 


In Sylvester’s view, this was a priority issue of the greatest magnitude. Virtually his 
entire scientific reputation rested on his work in invariant theory, and the Ger- 
mans had not only failed to give it its due but also isolated and patched a major 
hole in the entire British approach. May the better theory win; this battle would be 
fought in the international mathematical arena. 

In 1877, Sylvester was convinced that he had not only won the battle but had in 
some sense won the invariant-theoretic war against the Germans. In May of that 
year, Hermite communicated the following announcement to the Paris Academy: 


Baltimore—Since my last communication, please inform the Academy that I 
have resolved the problem of finding the complete set of Groundforms for
arbitrary forms in n variables [17, p. 975, our translation].


This stunning announcement was rather quickly followed by a retraction, but 
Sylvester persisted in the struggle for the result. In the fall of 1878, he met his
adversary head on when he claimed again to have the theorem in its full generality.
Once again, though, his claim was false. 

Sylvester continued on and off to try to find a British-style proof of Gordan’s 
theorem, and, while he never succeeded, he also never seemed to question the 
personal and nationalistic motivations behind such a quest, accepting fully the 
reality of the increasingly international arena [29] in which mathematicians
competed. When David Hilbert gave the first inklings in 1888 and 1889 [19, 2:176-198]
of the new invariant-theoretic methods he would bring forth in their fuller glory in 
1890 [19, 2:199-257], Sylvester knew he had lost a major skirmish and conceded 
defeat gracefully in a letter to Felix Klein. Hilbert “has rendered a very good 
service to Algebra, in obtaining so simple a proof of Gordan’s theorem,” Sylvester 
told Klein, “and I should like to be able to congratulate him on his brilliant 
invention. What a relief from the previous methods of proof!” [41]. International 
competition was important, but it was also important that credit be given where 
due. It is not hard to imagine, however, that Sylvester got some satisfaction from 
the fact that, in a real sense, Gordan had been bested as well! 

As a professor at The Johns Hopkins University, Sylvester did not seek to shield 
his students from the realities of international competition like his own with the 
German invariant-theoretic school. Rather, he worked to instill in them his strong 
sense of the importance of international, as opposed to merely national, exposure. 
From his own personal experience, he knew how important the foreign imprimatur 
was to the establishment of real reputation. In 1881, for example, when Sylvester 
and his Hopkins students were hard at work on what would become their 
groundbreaking paper on partition theory [1, 4:1-81], one of the students, Fabian 
Franklin, devised a wonderfully simple, graphical proof of Euler’s pentagonal 
number theorem [12]. Sylvester was so pleased by and excited about this result that 
he had Franklin write it up for communication through Hermite to the Paris 
Academy. On 29 April, 1881, Hermite wrote to Sylvester offering his own praise 
for Franklin’s proof and giving an indication of the attention that it was getting in 
France. “It certainly will not be unpleasant for you to hear that I was not the only 
person to be very interested in Mr. Franklin’s very original and ingenious proof,” 
he wrote. 


Mr. Halphen, one of our most eminent young mathematicians,... found 
Franklin’s method so remarkable that he lectured on it in one of the recent 
sessions of the Société philomatique. Please tell Mr. Franklin that his talent 
is appreciated, as it deserves to be, by the mathematicians of the old world 
[16, our translation]. 


Writing to Cayley perhaps on the very day he received Hermite’s letter, Sylvester 
could not contain his pleasure over this reaction to his student’s work. “Hermite,” 
he gushed, “is overflowing with admiration at the beauty of the method” [37]. 

Following the success of his student, Franklin, and in the wake of the concerted 
combinatorial research in which he had engaged all of his students [23], Sylvester 
began to feel the strain of leading America’s first research-level program in 
mathematics. In 1883, after seven-and-a-half years in Baltimore, he resigned from 
his Hopkins professorship to assume the Savilian Chair of Geometry at Oxford; the 
passage of the Universities Tests Act in 1871 allowed Sylvester, as a non-Anglican, to
hold the Oxford chair. At the age of sixty-nine, he returned home, his efforts at 
reputation-building having finally yielded what for him was the ultimate prize—the 
recognition of the Oxbridge academic establishment.

Sylvester’s career can hardly be considered typical of mathematicians in Victo-
rian Britain. For one thing, the fact that he was Jewish initially closed many of the 
usual avenues of a mathematical career to him. Twice this led him to leave Britain 
in the hopes of greener pastures in the United States, to broaden his focus to an 
international arena albeit the wrong one at least in the 1840s. Moreover, it forced 
him to look beyond the English academic scene for ways to establish his reputation 
in his chosen field. If he could not have the validation of a prestigious position at 
an English university, he could at least measure his self-worth in terms of an
international renown—hard won and carefully cultivated—that the majority of the 
mathematical practitioners within the English academic system would never enjoy. 

Another aspect of Sylvester that made him atypical of Victorian mathematicians 
was his ego. It was undeniably large. He wanted to be known for his research; he 
wanted his work appreciated; he was often the first person to pronounce his latest 
theorem “remarkable” or “beautiful.” Yet, Sylvester was much more complicated 
than this simplistic analysis would suggest. He loved mathematics and believed it to 
be eternal and transcendent. He wanted to make enduring contributions to it. As 
he explained to Lord Brougham in the aftermath of one of his altercations with the 
authorities at Woolwich, “I trust... to leave a lasting mark on ‘The Algebra of the 
Future’ ” [35]. This, too, motivated him to make connections with mathematicians 
internationally as well as nationally. They were all striving toward the same goal, 
the construction of what Camille Jordan called the “temple of Algebra” [21] or, 
more generally, the temple of mathematics. And, even if they competed like 
Sylvester and Gordan, they labored in common cause. 

Both of these atypical aspects of Sylvester as Victorian mathematician led, in his 
case, to the formation, maintenance, and utilization of an international network. 
Beginning at mid-century, national mathematical research communities (defined
in terms of specialized professional associations, specialized journals, venues for
the training of future researchers, and the overall emphasis on the production of
original research) were under formation in Europe and somewhat later in the
United States [25]. Sylvester, also beginning at mid-century, seemed to have a 
strong sense of the value of a further step in the professional development of 
mathematics, the internationalization of the field.


The Inverse-Similarity Problem for Real
Orthogonal Matrices 


Geoffrey R. Goodson 


INTRODUCTION. The aim of this article is to make accessible some recent 
results ([1], [2]) in the spectral theory of unitary operators. We do this by
investigating the special case of real unitary matrices, i.e., real orthogonal matrices. 
Our development reinforces the importance of canonical forms and matrix decom- 
positions in the undergraduate linear algebra curriculum. 

An orthogonal matrix Q (with entries from any ring) is a square matrix whose 
transpose is its inverse ($Q^T Q = Q Q^T = I$). It is known that every square matrix U
with entries from any field is similar to its transpose, that is, $UA = AU^T$ for some
nonsingular matrix A; we say that such an A is a similarity between U and $U^T$ and
we note that A may always be chosen to be symmetric (Theorem 1 of [6]; see 3.2.3 
of [4] for the easier complex case). We are interested in the form such similarities 
A can have when U is real orthogonal, and what they tell us about the matrix U. 

Using the orthogonal canonical form of U, we show that A may be chosen to be 
a real orthogonal matrix satisfying $A^2 = I$. Our main theorems (Theorems 1 and 2)
give necessary and sufficient conditions for every real orthogonal similarity A
between U and its transpose to be an involution (i.e., satisfy $A^2 = I$). Finally, we
investigate the case where there exist such similarities A that are not involutions.
We show (Theorem 3) that the eigenvalues of U corresponding to eigenvectors of
U in the orthogonal complement of the subspace $\{x : A^2 x = x\}$ have even
multiplicity.

It may be that the results of this paper are well known, or follow easily from 
known results. Theorem 1, for example, follows from Theorem 2 of Taussky and 
Zassenhaus [6], which says: Every non-singular matrix transforming U into its trans- 
pose is symmetric if and only if the minimal polynomial of U is equal to its 
characteristic polynomial. A survey of results concerning the links between general 
matrices and their transposes is given in [5]. 

However, Theorem 3 may actually be new, and certainly the generalizations to 
the infinite dimensional situation, which we state in Section 2, are new (see [1] and 
[2]). Although the proofs of our theorems use matrix methods, their infinite
dimensional analogs require the spectral theory of unitary operators. 

Throughout we shall be working with real orthogonal matrices belonging to the
space of all $n \times n$ complex matrices $M_{n,n}(\mathbb{C})$. Our vectors are in $Y = M_{n,1}(\mathbb{C})$,
the space of $n \times 1$ complex matrices.


1. THE SIMILARITY BETWEEN A REAL ORTHOGONAL MATRIX AND ITS 
TRANSPOSE. Given a real orthogonal matrix U, we can construct a real orthogo- 
nal involutory similarity A between U and its transpose in the following way: 
Every real orthogonal matrix U has an orthogonal canonical form D (see [4],
p. 108), i.e., U can be written in the form $U = QDQ^T$, where Q is a real orthogonal
matrix and D is a block diagonal matrix of the form


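$$
D = \begin{pmatrix}
I_{\theta_1} & 0 & \cdots & 0 & \cdots & 0 \\
0 & I_{\theta_2} & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & & \vdots \\
0 & 0 & \cdots & \pm 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & \cdots & \pm 1
\end{pmatrix}
$$

and each $I_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$ for some real $\theta$. Since every matrix of the form $I_\theta$ is
similar to its transpose via an involution $K$ (a $2 \times 2$ real orthogonal matrix with $K^2 = I$ and $I_\theta K = K I_\theta^T$), we can use such matrices
as building blocks to construct a real orthogonal matrix B that gives a similarity
between D and $D^T$. More explicitly, we write

$$D = D_1 \oplus D_2 \oplus \cdots \oplus D_k,$$

where the matrices $D_i$ are either $2 \times 2$ matrices of the form $I_\theta$, or $1 \times 1$ matrices
of the form $[\pm 1]$. Now put

$$B = B_1 \oplus B_2 \oplus \cdots \oplus B_k,$$

where $B_i = K$ if $D_i$ is of the form $I_\theta$, and $B_i = [1]$ when $D_i = [\pm 1]$.

Since $B^2 = I$, we see that $A = QBQ^T$ is a real orthogonal similarity between U
and $U^T$, and $A^2 = I$.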
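The construction can be checked numerically. What follows is a minimal sketch in plain Python, under assumptions that are not taken from the text: the coordinate-swap matrix [[0, 1], [1, 0]] stands in for the involution K, the angles and the orthogonal matrix Q are arbitrary, and the helper names (matmul, transpose, close, rotation, direct_sum) are hypothetical.

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to floating-point error."""
    return all(abs(a - b) <= tol
               for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

def rotation(theta):
    """The 2x2 block with rows (cos t, sin t) and (-sin t, cos t)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def direct_sum(blocks):
    """Block diagonal matrix built from square blocks."""
    n = sum(len(b) for b in blocks)
    out = [[0.0] * n for _ in range(n)]
    pos = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, v in enumerate(row):
                out[pos + i][pos + j] = v
        pos += len(b)
    return out

# Assumption: a concrete reflection standing in for the involution K.
K = [[0.0, 1.0], [1.0, 0.0]]

theta = 0.7                                  # arbitrary illustrative angle
D = direct_sum([rotation(theta), [[-1.0]]])  # D = I_theta (+) [-1]
B = direct_sum([K, [[1.0]]])                 # B = K (+) [1]

# Any real orthogonal Q will do; this one mixes the second and third coordinates.
c, s = math.cos(1.1), math.sin(1.1)
Q = [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

U = matmul(matmul(Q, D), transpose(Q))       # U = Q D Q^T
A = matmul(matmul(Q, B), transpose(Q))       # A = Q B Q^T
I3 = direct_sum([[[1.0]]] * 3)

print(close(matmul(U, A), matmul(A, transpose(U))))  # checks UA = AU^T
print(close(matmul(A, A), I3))                       # checks A^2 = I
print(close(matmul(A, transpose(A)), I3))            # checks A is orthogonal
```

Any other 2 x 2 reflection could be substituted for K in this sketch, since every reflection conjugates a plane rotation to its inverse.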

A real orthogonal matrix is a special type of unitary matrix. A complex matrix U
is unitary if $UU^* = U^*U = I$, where $U^* = \bar{U}^T$, the conjugate transpose of U. The
eigenvalues of a unitary matrix U are of absolute value 1. Any unitary matrix is an
isometry on the vector space Y (with the Euclidean norm), so the same is true for
any real orthogonal matrix. Recall that an eigenvalue of U is simple if it is a
non-repeated root of its characteristic equation $p(\lambda) = \det(U - \lambda I) = 0$. For any
normal matrix, eigenvectors corresponding to different eigenvalues are orthogonal.
Furthermore, if the characteristic equation has a root $\lambda$ repeated r times, $\lambda$ is said
to be an eigenvalue of algebraic multiplicity r (so a simple eigenvalue has algebraic
multiplicity 1). On the other hand, if for a given eigenvalue $\lambda$ there are r (and no
more than r) independent eigenvectors, we say that the geometric multiplicity of $\lambda$
is r. For real orthogonal matrices (in fact, for any diagonalizable matrix, and hence
for any normal, unitary, or real orthogonal matrix), the algebraic and geometric
multiplicities of all eigenvalues coincide. For any real orthogonal matrix U, the
characteristic polynomial $p(\lambda) = \det(U - \lambda I)$ has real coefficients, so the
eigenvalues of U are real ($\pm 1$), or occur in complex conjugate pairs (necessarily of
modulus 1).
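
For a single $2 \times 2$ rotation block these statements can be checked by hand. The short computation below is a worked illustration, writing $I_\theta$ for the block with rows $(\cos\theta, \sin\theta)$ and $(-\sin\theta, \cos\theta)$:

```latex
\begin{align*}
\det(I_\theta - \lambda I)
  &= (\cos\theta - \lambda)^2 + \sin^2\theta
   = \lambda^2 - 2\lambda\cos\theta + 1, \\
\lambda &= \cos\theta \pm i\sin\theta = e^{\pm i\theta}, \qquad |\lambda| = 1 .
\end{align*}
```

The two roots are real exactly when $\sin\theta = 0$, that is, when $\theta = m\pi$ for an integer $m$; otherwise they form a conjugate pair of simple eigenvalues of modulus 1.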

We now give two lemmas. The proof of the first one is a straightforward 
calculation, so is omitted. The second is a special case of a classical theorem of 
Sylvester and is the main tool in the proof of our first theorem; for the sake of 
completeness we sketch its proof in a special case (see problem 9 in (2.4) of [4]). 


Lemma 1. Let K = ; y , and K’= ({ 5): Then B is a 2 X 2 real orthogonal 
matrix if and only if there is a real $\theta$ such that

(a) $B = I_\theta$, (b) $B = K I_\theta$, or (c) $B = K' I_\theta$.

Lemma 2. Let F and G be $n \times n$ and $m \times m$ complex matrices, respectively, and
suppose they have no eigenvalues in common. Then the matrix equation $FX - XG = C$,
X an $n \times m$ matrix, has a unique solution. When $C = 0$, this solution is $X = 0$.


Proof: We restrict our attention to the case where $C = 0$. If $FX = XG$, then
$F^k X = X G^k$ for all $k = 0, 1, \ldots$, and hence by linearity, $p(F)X = Xp(G)$ for any
polynomial $p(t)$. Choose $p(t)$ to be the characteristic polynomial of F, and invoke
the Cayley-Hamilton Theorem to obtain $p(F)X = 0 = Xp(G)$. Since the eigenvalues
of F (which we denote by $\lambda_1, \ldots, \lambda_n$) are all different from the eigenvalues
of G, $p(G) = (G - \lambda_1 I)(G - \lambda_2 I) \cdots (G - \lambda_n I)$ is non-singular and hence the
equation $Xp(G) = 0$ has only the solution $X = 0$. ∎


Theorem 1. Let $U$ and $A$ be real orthogonal matrices and suppose that
$$UA = AU^T. \tag{1}$$
If all the eigenvalues of $U$ are simple, then $A^2 = I$.


Proof: The idea is to show that equation (1) forces $A$ to have real eigenvalues,
which can only be $\pm 1$, so $A^2 = I$. To do this we use an orthogonal canonical form
of $U$ to write $U = QDQ^T$, where $Q$ is real orthogonal and $D$ is block diagonal:
$$D = D_1 \oplus D_2 \oplus \cdots \oplus D_k,$$
for some integer $k$, and every $D_i$, $i = 1, \ldots, k$, is either $1 \times 1$ (in which case
$D_i = [\pm 1]$) or $2 \times 2$ with non-real eigenvalues (in which case it is a $2 \times 2$ real
orthogonal matrix of the form $I_\theta$, whose eigenvalues are $e^{\pm i\theta}$ with $\theta \neq m\pi$ for
any $m \in \mathbb{Z}$).

Substituting $U = QDQ^T$ into equation (1) gives
$$DB = BD^T \tag{2}$$
where $B = Q^T A Q$ is a real orthogonal matrix with the same eigenvalues as $A$.
Equation (2) gives us information about the block structure of $B$.

Partition $B = [B_{ij}]$ conformally with $D$. Then equation (2) is equivalent to the
block matrix equations
$$D_i B_{ij} - B_{ij} D_j^T = 0; \quad i, j = 1, \ldots, k. \tag{3}$$

Since all the eigenvalues of $U$, and hence also of $D$, are simple, the eigenvalues
of $D_i$ and $D_j$ are disjoint for all $i \neq j$. By Lemma 2, $B_{ij} = 0$ for all $i \neq j$. This
implies that $B$ is a block diagonal real orthogonal matrix whose diagonal blocks
are either $1 \times 1$ (in which case they are $[\pm 1]$), or $2 \times 2$ (in which case Lemma 1
ensures that they are of the form $I_\varphi$, $K I_\varphi$, or $K' I_\varphi$).

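If $B_{ii}$ is $1 \times 1$, then clearly $B_{ii}^2 = I$, so consider the case where $B_{ii}$ is $2 \times 2$. If
we are given two $2 \times 2$ real orthogonal matrices $D_i = I_\theta$ and $B_{ii} = I_\varphi$ (think of
them as plane rotations), then $I_\theta I_\varphi = I_{\theta + \varphi}$ and $I_\theta^T = I_{-\theta}$, and so $I_\theta I_\varphi = I_\varphi I_{-\theta}$ implies that
$\theta + \varphi = \varphi - \theta + 2m\pi$, and hence $\theta = m\pi$. This contradiction shows that $B_{ii}$
cannot be of the form $I_\varphi$. Consequently each $2 \times 2$ block $B_{ii}$ must be of the form $K I_\varphi$ or $K' I_\varphi$
(from Lemma 1). But
$$(K I_\varphi)^2 = K I_\varphi K I_\varphi = K^2 I_{-\varphi} I_\varphi = K^2 = I,$$
and the same computation shows that $(K' I_\varphi)^2 = I$. We conclude that $B^2 = I$, and
hence $A^2 = I$. ∎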
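The 2 × 2 step of this argument is easy to check numerically. The following is only a sketch under stated assumptions: it uses the rotation convention I_t = [[cos t, sin t], [-sin t, cos t]] and takes K to be the coordinate-swap reflection, both chosen for the illustration; the helper names `rot`, `matmul`, `transpose` and `close` are hypothetical.

```python
import math

def rot(t):
    """Plane rotation I_t = [[cos t, sin t], [-sin t, cos t]] (assumed convention)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s], [-s, c]]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I2 = [[1.0, 0.0], [0.0, 1.0]]
K = [[0.0, 1.0], [1.0, 0.0]]   # assumed reflection; any reflection behaves the same way

theta, phi = 0.7, 1.9          # theta is not a multiple of pi
D = rot(theta)
B_reflect = matmul(K, rot(phi))
B_rotate = rot(phi)

# A reflection block satisfies D B = B D^T and squares to the identity.
reflection_intertwines = close(matmul(D, B_reflect), matmul(B_reflect, transpose(D)))
reflection_is_involution = close(matmul(B_reflect, B_reflect), I2)

# A rotation block cannot satisfy D B = B D^T unless theta is a multiple of pi.
rotation_intertwines = close(matmul(D, B_rotate), matmul(B_rotate, transpose(D)))
```

With these values the reflection block passes both checks and the rotation block fails the intertwining relation, which is the dichotomy the proof relies on.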
We now consider a partial converse to Theorem 1.

Theorem 2. Let $U$ be a real orthogonal matrix. If the only real orthogonal similarities
between $U$ and $U^T$ are involutions (i.e., $UA = AU^T$ and $AA^T = I$ implies $A^2 = I$),
then all the eigenvalues of $U$ are simple.


Proof: We show that if $U$ has any non-simple eigenvalues, then there are real
orthogonal matrices $A$ for which $UA = AU^T$ and $A^2 \neq I$. There are essentially two
cases to consider.


Case (i). Suppose that the eigenvalue $\lambda = 1$ has multiplicity at least two (a similar
argument works for $\lambda = -1$). If $D$ is an orthogonal canonical form of $U$, we can
decompose $D$ as $D = D_1 \oplus D_2 \oplus \cdots \oplus D_k$, where $D_i = [1]$ and $D_j = [1]$ for some
$j \neq i$. Since the $D_i$'s may be rearranged in any order, we may suppose that
$i = j + 1 = k$, so that $D = D_1 \oplus \cdots \oplus D_{k-2} \oplus D'$, where $D' = D_{k-1} \oplus D_k$ is a
$2 \times 2$ identity matrix. Now define $B = B_1 \oplus B_2 \oplus \cdots \oplus B_{k-2} \oplus B'$, where for $i = 1, \ldots,
k - 2$: $B_i = K$ if $D_i = I_\theta$, $B_i = (1)$ if $D_i = [\pm 1]$, and
$$B' = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
It is clear that $DB = BD^T$, but $B^2 \neq I$ since $B'^2 \neq I$. Setting $A = QBQ^T$, we
have $UA = AU^T$, but $A^2 \neq I$.


Case (ii). Suppose that $U$ has no non-simple real eigenvalues, and that the
eigenvalue $e^{i\theta}$ (with $\theta \neq n\pi$) has multiplicity at least two. As in case (i), we may
assume that the last two blocks in the canonical form of $D$ are the same, and
$$D_{k-1} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} = D_k.$$
(The other possibility is that $D_{k-1} = I_\theta = D_k^T$, and this can be treated similarly.)
As in case (i), construct a block matrix $B$, except that corresponding to $D' = D_{k-1}
\oplus D_k$, put $B' = J$, where $J = \begin{pmatrix} 0 & K \\ -K & 0 \end{pmatrix}$. Again we see that $D'J = JD'^T$ and $J^2 \neq I$,
giving the desired conclusion. ∎


The next result gives additional information about the nature of the eigenvalues 
of real orthogonal matrices. As in the proof of Theorem 1, we shall see that the 
eigenvalues of the matrix A play an important role. 

The crucial observation is that if $\lambda$ is an eigenvalue of $U$ corresponding to some
eigenvector $x$, then $A\bar{x}$ is also an eigenvector for $U$ (where $\bar{x}$ is the complex
conjugate of the vector $x$), corresponding to the same eigenvalue $\lambda$, and conversely, i.e.,
$$Ux = \lambda x \iff UA\bar{x} = \lambda A\bar{x}. \tag{4}$$

This is a consequence of the fact that $|\lambda| = 1$ so that $1/\lambda = \bar{\lambda}$, and if $\lambda$ is an
eigenvalue of any nonsingular matrix $B$ corresponding to an eigenvector $x$, then
$1/\lambda$ is an eigenvalue of $B^{-1}$ corresponding to the same eigenvector $x$, together
with the equalities:
$$UA\bar{x} = \overline{UAx} = \overline{AU^{-1}x} = \overline{\bar{\lambda}Ax} = \lambda A\bar{x}.$$

The idea of the proof of Theorem 3 is to show that if $x$ is an eigenvector of $U$ in
the orthogonal complement of the subspace $\{y \in \mathcal{V}: A^2 y = y\}$, then $x$ and $A\bar{x}$ are
orthogonal, and hence must be independent eigenvectors of $U$ corresponding to
the same eigenvalue.

Theorem 3. Let $U$ and $A$ be given real orthogonal matrices. If $UA = AU^T$ and
$A^2 \neq I$, then every eigenvalue of $U$ with a corresponding eigenvector in the orthogonal
complement of $H = \{x \in \mathcal{V}: A^2 x = x\}$ has even multiplicity.


Proof: We start by showing that it suffices to prove the theorem in the special
cases where $U = \pm I$, or $U = I_\theta \oplus I_\theta \oplus \cdots \oplus I_\theta$ ($m$ copies) for some
real number $\theta \neq n\pi$.

Let $D = D_1 \oplus D_2 \oplus \cdots \oplus D_k$ be an orthogonal canonical form of $U$, so $U =
QDQ^T$. Then $DB = BD^T$, where $B = Q^T A Q$.

Either $D_i = I_{\theta_i}$ for some real number $\theta_i$, or $D_i = [\pm 1]$. We can write $D =
E_1 \oplus E_2 \oplus \cdots \oplus E_m$, where $E_i$ consists of the direct sum of all copies of those $D_j$
having the same eigenvalues, i.e., $E_i = I_\theta \oplus I_\theta \oplus \cdots \oplus I_\theta$ for some real $\theta$, or
$E_i = \pm I$ (possibly different identity matrices). We do this to ensure that the
spectra of $E_i$ and $E_j$ are disjoint for $i \neq j$; $i, j = 1, \ldots, m$.

If we partition $B = [B_{ij}]$ conformally with $D = E_1 \oplus E_2 \oplus \cdots \oplus E_m$, then
$$E_i B_{ij} = B_{ij} E_j^T; \quad i, j = 1, \ldots, m.$$

Lemma 2 ensures that $B_{ij} = 0$ for $i \neq j$. This gives a block diagonal representation
for $B$, and it suffices to consider each equation $E_i B_{ii} = B_{ii} E_i^T$ separately. Consequently we have essentially two cases to consider as the other cases can be treated
in a similar manner.


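Case (i). $U = I_\theta \oplus I_\theta \oplus \cdots \oplus I_\theta$ ($m$ copies), for some real $\theta \neq n\pi$. Since $UA =
AU^T$, both $H$ and $H^\perp$ are $U$- and $A$-invariant subspaces of $\mathcal{V}$.

We define new subspaces $\mathcal{V}_1$ and $\mathcal{V}_2$ of $\mathcal{V}$ by: $\mathcal{V}_1$ = linear span of those
eigenvectors of $A^2$ corresponding to eigenvalues $\lambda \in S^+$, where $S^+$ = upper half
of the unit circle in the complex plane (not including $\pm 1$). Similarly we define $\mathcal{V}_2$
in terms of $S^-$ = lower half of the unit circle (again not including $\pm 1$). Clearly $\mathcal{V}_1$
and $\mathcal{V}_2$ are $U$-invariant, for if $x \in \mathcal{V}_1$ is an eigenvector of $A^2$, say $A^2 x = \lambda x$, then
$\lambda \in S^+$, and
$$A^2 Ux = UA^2 x = U\lambda x = \lambda Ux,$$
so that $Ux \in \mathcal{V}_1$.
If $H_{-1} = \{x \in \mathcal{V}: A^2 x = -x\}$, then it can be seen that
$$\mathcal{V} = H \oplus \mathcal{V}_1 \oplus \mathcal{V}_2 \oplus H_{-1}$$
is a direct sum of $U$-invariant subspaces. Furthermore, if $x \in \mathcal{V}_1$, then $A\bar{x} \in \mathcal{V}_2$
(since if $A^2 x = \lambda x$, where $\lambda \in S^+$, then $A^2(A\bar{x}) = \bar{\lambda} A\bar{x}$, where $\bar{\lambda} \in S^-$).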

Suppose that $\{x_1, x_2, \ldots, x_r\} \subset \mathcal{V}_1$ is an orthonormal set of eigenvectors of $U$
corresponding to the eigenvalue $e^{i\theta}$. Then equation (4) implies that the set
$$\{A\bar{x}_1, A\bar{x}_2, \ldots, A\bar{x}_r\} \subset \mathcal{V}_2$$
is also an orthonormal set of eigenvectors corresponding to $e^{i\theta}$. We conclude that
there are an even number of pairwise orthogonal eigenvectors in $\mathcal{V}_1 \oplus \mathcal{V}_2$ corresponding to $e^{i\theta}$.

In a similar manner we show that the multiplicity of $e^{i\theta}$ on the subspace $H_{-1}$ is
even. This is slightly more complicated because if $x \in H_{-1}$ is an eigenvector of $U$
corresponding to $e^{i\theta}$, then $A\bar{x}$ is another eigenvector of $U$ in $H_{-1}$ corresponding
to $e^{i\theta}$. However, we can use the fact that $A^2 x = -x$, $x \in H_{-1}$, to show that $A\bar{x}$ and
$x$ are orthogonal.

Denote the scalar product of $x, y \in \mathcal{V}$ by $\langle x, y \rangle$ (so $\langle x, y \rangle = x^* y$), and recall
that the isometry $A$ preserves this scalar product. Therefore if $x \in H_{-1}$,
$$\langle x, A\bar{x} \rangle = \langle Ax, A^2\bar{x} \rangle = \langle Ax, -\bar{x} \rangle = -\langle Ax, \bar{x} \rangle = -\langle x, A\bar{x} \rangle,$$
where the last equality uses the fact that $A$ is real, and hence $\langle x, A\bar{x} \rangle = 0$. Given an orthonormal set of eigenvectors in $H_{-1}$ corresponding to $e^{i\theta}$, we can use this idea to construct a set of orthonormal eigenvectors
of the form $\{x_1, A\bar{x}_1, x_2, A\bar{x}_2, \ldots, x_s, A\bar{x}_s\}$ that spans the same subspace. This is
done by first choosing $x_1$, and hence $A\bar{x}_1$, as orthonormal eigenvectors. Now use
the Gram-Schmidt orthonormalization process to find an eigenvector $x_2$ orthogonal to both $x_1$ and $A\bar{x}_1$, and then check that $A\bar{x}_2$ must also be orthogonal to $x_1$
and $A\bar{x}_1$. Continuing in this way gives a basis of eigenvectors having an even
number of members.


Case (ii). If $U = \pm I$, an argument similar to the one in case (i) gives the desired
result. ∎


Examples. It is interesting to notice that Theorem 3 holds even when $U = I$, the
identity matrix. In fact, suppose that $U$ and $A$ are orthogonal matrices satisfying
the conditions of Theorem 3. If they are $2 \times 2$ matrices, and if $U$ has non-simple
eigenvalues, then $U = \pm I$. If they are both $3 \times 3$ matrices, then $U = [\pm 1] \oplus [\pm 1]
\oplus [\pm 1]$ (8 possibilities). It is not too difficult to list the possibilities for (the
orthogonal canonical form of) $U$ in the $4 \times 4$ case. Here are some possibilities:


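1. Let
$$U = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix},$$
so $U$ has $\pm i$ as double eigenvalues. Consider
$$A = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 & -1 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \end{pmatrix},$$
for which $UA = AU^T$, but $A^2 \neq I$.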
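The claims in this example can be confirmed by direct multiplication. The sketch below assumes the reading of the two matrices in which U is the direct sum of two copies of [[0, 1], [-1, 0]] and A is 1/√2 times the integer matrix with rows (0, 0, -1, 1), (0, 0, 1, 1), (1, 1, 0, 0), (1, -1, 0, 0); the helper names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
minus_I4 = [[-v for v in row] for row in I4]

# Assumed reading of the example: U has eigenvalues +i and -i, each twice.
U = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]
r = 1 / math.sqrt(2)
A = [[r * v for v in row]
     for row in [[0, 0, -1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, -1, 0, 0]]]

a_is_orthogonal = close(matmul(A, transpose(A)), I4)
intertwines = close(matmul(U, A), matmul(A, transpose(U)))   # UA = AU^T
a_squared_is_identity = close(matmul(A, A), I4)
u_squared_is_minus_identity = close(matmul(U, U), minus_I4)  # so eigenvalues are +-i
```

Under this reading A is orthogonal and intertwines U with its transpose, yet A squared is not the identity, consistent with the even multiplicities predicted by Theorem 3.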

2. Here is a general method for constructing orthogonal matrices $U$ and $A$
satisfying the conditions of Theorem 3, together with $A^n \neq I$ for all $n \in \mathbb{Z}$, $n \neq 0$.
First start by choosing $U_0$ to be any orthogonal matrix with the property that
$U_0^n \neq I$ for all $n \in \mathbb{Z}$, $n \neq 0$; for example take
$$U_0 = \begin{pmatrix} \cos \pi^2 & \sin \pi^2 \\ -\sin \pi^2 & \cos \pi^2 \end{pmatrix}.$$
Now define $U$ on $\mathcal{V} \oplus \mathcal{V}$ by
$$U = U_0 \oplus U_0^T.$$

Define the linear operator $A: \mathcal{V} \oplus \mathcal{V} \to \mathcal{V} \oplus \mathcal{V}$ by $A(x \oplus y) = y \oplus U_0 x$, and
check that $UA = AU^T$ and $A^n \neq I$ for all $n \in \mathbb{Z}$, $n \neq 0$. From the construction it
is clear that $U$ has eigenvalues of even multiplicity.
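A finite numerical check of this construction is sketched below. It assumes U_0 is the plane rotation through π² radians and that A acts by A(x ⊕ y) = y ⊕ U_0 x, so that A has block form [[0, I], [U_0, 0]]; only the powers n = 1, ..., 40 are tested, so this illustrates rather than proves the claim. All helper names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-9):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

c, s = math.cos(math.pi ** 2), math.sin(math.pi ** 2)
U0 = [[c, s], [-s, c]]            # assumed: rotation through pi^2 radians
Z2 = [[0.0, 0.0], [0.0, 0.0]]

U = block(U0, Z2, Z2, transpose(U0))   # U = U0 (+) U0^T
A = block(Z2, identity(2), U0, Z2)     # A(x (+) y) = y (+) U0 x

intertwines = close(matmul(U, A), matmul(A, transpose(U)))

# Record whether any of the first 40 powers of A returns to the identity.
power = identity(4)
powers_equal_identity = []
for n in range(1, 41):
    power = matmul(power, A)
    powers_equal_identity.append(close(power, identity(4)))
```

Even powers of A are direct sums of powers of U_0 and odd powers are off-diagonal, so none of the tested powers is the identity.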


2. INFINITE DIMENSIONAL GENERALIZATIONS. The results in this section 
were motivated by questions arising in ergodic theory, the study of the dynamical 
and spectral properties of measure-preserving transformations.

Let $U: H \to H$ be a unitary operator defined on a separable Hilbert space $H$,
i.e., $U$ is an isomorphism of $H$ onto $H$. Measure-preserving transformations give
rise to unitary operators in the following way.

We assume that $(X, \mathcal{F}, \mu)$ is a measure space that is isomorphic to the unit
interval with its Borel measurable sets and Lebesgue measure. A transformation
$T: X \to X$ is said to be measure preserving if $T^{-1}B \in \mathcal{F}$ and $\mu(T^{-1}B) = \mu(B)$
whenever $B \in \mathcal{F}$. If both $T$ and $T^{-1}$ are measure preserving, we call $T$ an
automorphism of $(X, \mathcal{F}, \mu)$. If we define an operator
$$U_T: L^2(X, \mu) \to L^2(X, \mu) \quad \text{by} \quad U_T f(x) = f(Tx); \; x \in X; \; f \in L^2(X, \mu),$$
then it can be shown that $U_T$ is a unitary operator (to see it is an isometry, check
first for characteristic functions of measurable sets and then extend using linearity
to simple functions, etc.).

A central notion in ergodic theory is the idea of ergodicity of a transformation, 
which says, roughly speaking, that the space X is indecomposable under the action 
of $T$. More precisely, $T: X \to X$ is ergodic if $T^{-1}A = A$, $A \in \mathcal{F}$, implies $\mu(A) = 0$
or 1.

An important open question of ergodic theory is the nature of the spectrum of 
ergodic automorphisms. The first major result involving the spectra of ergodic 
transformations was the discrete spectrum theorem of Halmos and von Neumann 
([7], page 68). $T$ is said to have discrete spectrum if its eigenfunctions (i.e., those
$f \in L^2(X, \mu)$ for which $f(Tx) = \lambda f(x)$ for some $|\lambda| = 1$) form a complete orthonormal basis of $L^2(X, \mu)$. The discrete spectrum theorem says that two ergodic
transformations $T_1$ and $T_2$ with discrete spectrum are isomorphic (i.e., there is an
automorphism $S$ satisfying $T_1 S = S T_2$) if and only if they have the same eigenvalues. It is immediate from this theorem that if $T$ is ergodic and has discrete
spectrum, then $T$ is isomorphic to its inverse $T^{-1}$.

A starting point of our work in this area was the realization that for an ergodic 
automorphism with discrete spectrum, every isomorphism $S$ between $T$ and $T^{-1}$ is
an involution. It is important to note that unitary equivalence between $U_{T_1}$ and $U_{T_2}$
does not in general imply that $T_1$ and $T_2$ are isomorphic. The discrete spectrum
theorem says that this is true when $T_1$ and $T_2$ are ergodic with discrete spectrum.

There are few general results known concerning the spectra of ergodic automor- 
phisms. Theorems 4 and 5 resulted from a desire to obtain more such general 
information, and in particular to obtain new criteria for ergodic automorphisms to 
have a non-simple spectrum. 

In the infinite dimensional case it is possible that a unitary operator has no 
eigenvalues (it is said to have continuous spectrum). The infinite dimensional 
analog of simple eigenvalues is the idea of simple spectrum. A unitary $U: H \to H$
is said to have simple spectrum if there exists $h \in H$ with $Z(h) = H$, where $Z(h)$ is
the closed linear span of the subset $\{U^n h: n \in \mathbb{Z}\}$.

The following is the infinite dimensional analog of Theorem 1 (see [1]).


Theorem 4. Let $U: L^2(X, \mu) \to L^2(X, \mu)$ be a unitary operator that preserves real
valued functions and has simple spectrum. If $A$ is unitary, preserves real valued
functions, and satisfies $UA = AU^{-1}$, then $A^2 = I$.

There are very large classes of examples that satisfy the conditions of this
theorem and some examples resulting from ergodic theory are given in [1]. Note
that it can be shown that if $U_T$ has simple spectrum, then $T$ is ergodic. This is
because 1 is always an eigenvalue for $U_T$, and must be a simple eigenvalue if $U_T$
has simple spectrum, i.e., the only eigenfunctions are constants. It is now an
exercise to show that ergodicity is equivalent to the fact that the only invariant
functions for $T$ (i.e., $f(Tx) = f(x)$ a.e.) are constants. We should mention that the
infinite dimensional analog of Theorem 2 is also true.

The infinite dimensional analog of Theorem 3 gives additional information. This
result was first formulated and proved in [2]. For a unitary operator $U: H \to H$ on
a separable Hilbert space $H$, $U$ is completely determined up to unitary equivalence by a measure $\sigma$ defined on the unit circle $S^1$, called the maximal spectral type
of $U$, and a function $p: S^1 \to \mathbb{Z}^+ \cup \{\infty\}$, called the multiplicity function. The essential
values of this function are the values it takes almost everywhere with respect to $\sigma$
(see [3]). We consider $\infty$ as an even number.


Theorem 5. Suppose that $U, A: L^2(X, \mu) \to L^2(X, \mu)$ are unitary operators that
preserve real valued functions and satisfy $UA = AU^{-1}$. Then in the orthogonal complement of the subspace
$$\{f \in L^2(X, \mu): A^2 f = f\},$$
the essential values of the multiplicity function of $U$ are even.


Rethinking Rigor in Calculus: The Role
of the Mean Value Theorem


Thomas W. Tucker 


1. INTRODUCTION. Mathematicians have been struggling with the theoretical 
foundations of the calculus ever since its inception. Bishop Berkeley’s attack on 
Newton’s “ghosts of departed quantities,” Euler’s claim that $1 - 1 + 1 - 1 + \cdots = 1/2$,
Cauchy’s $\varepsilon$-$\delta$ definition of limit, all are part of the fascinating history of
this struggle (see [7]). Calculus instructors and textbooks face the same struggle,
but the tack taken, although formal, is often not sensible or honest. Instead of an 
admission that Newton, Leibnitz, the Bernoullis, and Euler all managed quite well 
without any rigorous foundations, instead of the story how a rigorous calculus took 
mathematicians two hundred years to get right, the Mean Value Theorem is 
waved, like a cross in front of a vampire, to hold the difficulties at bay. The origin 
of the Mean Value Theorem in the structure of the real numbers is not addressed; 
that is much too difficult for a standard course. Maybe it is traced back to the 
Extreme Value Theorem, but the trail ends there. The result is that a technical 
existence theorem is introduced without proof and used to prove intuitively 
obvious statements, such as “if your speedometer reads zero, you are not going 
anywhere” (if f’ = 0 on an interval, then f is constant on that interval). That’s the 
sort of thing that gives mathematics a bad name: assuming the nonobvious to prove 
the obvious. And by the way, there is nothing obvious about the Mean Value 
Theorem without the hypothesis of continuity of the derivative. Cauchy himself 
was never able to prove it in that form. 

I have serious reservations about the need for formal theorems and proofs in a 
standard calculus course. On the other hand, for those mathematicians who do feel 
that need, I have a suggestion for an alternative theoretical cornerstone to replace 
the Mean Value Theorem (MVT); I hope textbook authors adopt it. It is much 
easier to state, much more intuitively obvious, and much more powerful than most 
mathematicians realize. It is simply this: 


The Increasing Function Theorem (IFT). If $f' \geq 0$ on an interval, then $f$ is increasing on that interval.


Here, increasing means that if $c < d$, then $f(c) \leq f(d)$. This would usually be
called nondecreasing, but that term is awkward; for example, nondecreasing and
not decreasing mean different things. It seems to make more sense to use the term
strictly increasing for the condition that if $c < d$, then $f(c) < f(d)$. A function that
is increasing, but not strictly increasing, we call weakly increasing.

Most of the rest of this paper is concerned with the consequences of the IFT, 
treating it as an axiom. I will give, however, a short independent proof of the IFT, 
for the sake of completeness and for readers who have probably never thought of 
proving the IFT directly without the MVT. Of course, the IFT follows easily from 
the MVT. In fact, the contrapositive of the IFT is a weak form of the MVT: if
$a < b$ and $f(b) < f(a)$, there is a number $c$, $a < c < b$, such that $f'(c) < 0$.

It is impossible to be a pioneer in territory as well-trodden as the Mean Value
Theorem. Others have championed calculus without the Mean Value Theorem
(see [1], [4], [6]). The first two sections of this paper follow Lax, Burstein, and Lax 
[9] quite closely, although unintentionally. In fact, after searching through dozens 
of calculus books for the Taylor remainder proof given in this paper and finally 
finding it in Lax-Burstein-Lax (LBL), I felt a little uncomfortable. Maybe this 
paper shouldn’t be published and all that is needed is an announcement “Go read 
LBL.” Then I read Grabiner [7] and found that the Taylor remainder proof given 
here and in LBL is actually Lagrange’s original proof. I was surprised that such a 
simple, direct proof could have been covered over by years of second-growth 
jungle. 

Moreover, the idea of Lagrange’s proof keeps being rediscovered for special 
cases like sinx or cosx. For example, the Monthly published such an article 
recently [2], which then generated a subsequent Editor’s Note [2] citing calculus 
textbooks and Monthly articles where the idea of [2] had already been presented. 
None of these references noted that the same idea works for all functions; LBL is 
still the only book that does that, to my knowledge. And hardly anyone seems to 
know the idea is really Lagrange’s! Under these circumstances, it appears that 
some dissemination is badly needed to clear up a memory lapse of generations of 
mathematicians. It also appears that previous calls ([4], [6]) to downplay the Mean
Value Theorem have fallen on deaf ears. Perhaps the recent debates about 
calculus instruction have unplugged some ears and it is time to try the call again. 


2. A PROOF OF THE INCREASING FUNCTION THEOREM. There is a reasonably elementary proof of the IFT that depends only on the nested interval property
of the reals: if $a_n \leq a_{n+1} < b_{n+1} \leq b_n$ for all $n \geq 1$ and $\lim_{n \to \infty}(b_n - a_n) = 0$, then
there is a number $c$ such that $\lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n = c$. The proof of the IFT
given here does not require the continuity of f’ and is so self-contained that it 
probably could be given in a standard calculus course. Although I generated this 
proof in response to some remarks of Peter Lax, I should have known the proof is 
too natural to be original. In revising this paper, I discovered Richmond’s article
[10], which contains essentially the same proof, and as I already knew, Ampère and
Cauchy used the key observation in their own proofs.




Proof of the IFT. The proof depends on the following simple 


Observation. Given a function $f$, define slope$(a, b)$ to be the usual quotient
$(f(b) - f(a))/(b - a)$. If slope$(a, b) = m$ and $c$ is between $a$ and $b$, then one of
slope$(a, c)$ and slope$(c, b)$ is greater than or equal to $m$ and one is less than or
equal to $m$. For a proof, draw the obvious picture.

Suppose now that $f'(x) \geq 0$ on $[a, b]$ and that $f$ is not increasing; that is, for
some $a_1, b_1$ with $a \leq a_1 < b_1 \leq b$, we have $f(a_1) > f(b_1)$. Let $m$ = slope$(a_1, b_1)$.
Note that $m < 0$. By repeated bisection and our observation, we can find a nested
sequence of intervals $[a_n, b_n]$ with slope$(a_n, b_n) \leq m$ and $\lim_{n \to \infty}(b_n - a_n) = 0$. Let
$c = \lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n$ (the possibility $c = a$ or $c = b$ causes no difficulty).
Since $f'(c) \geq 0$ and $m < 0$, for all $x$ sufficiently near $c$, slope$(x, c) > m$. Thus for
all large enough $n$, slope$(a_n, c) > m$ and slope$(c, b_n) > m$, which contradicts our
observation and the fact that, by construction, slope$(a_n, b_n) \leq m$. If $a_n = c$ or
$b_n = c$, the contradiction is immediate. ∎

As we have observed, the contrapositive of the IFT is an existence statement
that if $f$ is not increasing on the interval $[a, b]$, there exists a number $c$ between $a$
and $b$ where $f'(c) < 0$. The preceding proof is constructive, in that once one finds
$a_1 < b_1$ with $f(a_1) > f(b_1)$, the bisection procedure effectively computes a number
$c$ such that $f'(c) < 0$.
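The bisection procedure can be sketched in a few lines. This is an illustration only: the function names, the sample function x³ − x on [−0.5, 0.5], and the fixed number of bisection steps are all choices made here, and floating-point arithmetic limits how far the interval can usefully be shrunk.

```python
def slope(f, a, b):
    """Difference quotient (f(b) - f(a)) / (b - a)."""
    return (f(b) - f(a)) / (b - a)

def shrink_interval(f, a1, b1, steps=20):
    """Bisect [a1, b1], keeping at each step the half with the smaller slope.

    By the Observation, the smaller of the two half-interval slopes is at most
    the slope of the whole interval, so the kept slope never rises above
    m = slope(f, a1, b1).
    """
    m = slope(f, a1, b1)
    a, b = a1, b1
    for _ in range(steps):
        mid = (a + b) / 2
        if slope(f, a, mid) <= slope(f, mid, b):
            b = mid
        else:
            a = mid
    return a, b, m

# Hypothetical sample: f(-0.5) > f(0.5), so f is not increasing on [-0.5, 0.5].
def f(x):
    return x ** 3 - x

def fprime(x):
    return 3 * x ** 2 - 1

a, b, m = shrink_interval(f, -0.5, 0.5)
c = (a + b) / 2   # approximates the limit point where f'(c) <= m < 0
```

For this sample the intervals close in on 0, where the derivative is −1, comfortably below the starting slope m = −0.75.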


3. IMMEDIATE CONSEQUENCES OF THE IFT. We first consider some conse- 
quences and variations of the IFT. 


Theorem 1. The following statements are consequences of the IFT. Assume $f$ is
differentiable on $[a, b]$ and $a < b$.

a) If $f'(x) \leq 0$ on the interval $[a, b]$, then $f$ is decreasing on the interval $[a, b]$.

b) If $f'(x) = 0$ on the interval $[a, b]$, then $f$ is constant on the interval $[a, b]$.

c) If $f'(x) > 0$ on the interval $[a, b]$, then $f$ is strictly increasing on the interval
$[a, b]$.

d) If $f'(x) \leq g'(x)$ on the interval $[a, b]$, then $f(x) - f(a) \leq g(x) - g(a)$ for all $x$
in $[a, b]$.

e) If $m \leq f'(x) \leq M$ on the interval $[a, b]$, then $m(x - a) \leq f(x) - f(a) \leq
M(x - a)$ for all $x$ in $[a, b]$.


Proof:

(a) Multiplication by $-1$ reverses inequalities and interchanges “increasing”
and “decreasing”.

(b) By the IFT and (a), it follows that $f$ is both (weakly) increasing and (weakly)
decreasing on $[a, b]$. That means $f$ is constant.

(c) By the IFT, $f$ is increasing. Suppose that $a \leq c < d \leq b$ and $f(c) = f(d)$.
Since $f$ is increasing on $[c, d]$ we must have $f(x) = f(c) = f(d)$ on $[c, d]$.
That is, $f$ is constant on $[c, d]$. Therefore $f'(x) = 0$ on $[c, d]$, contradicting
$f'(x) > 0$ on $[a, b]$.

(d) Apply the IFT to $h(x) = g(x) - f(x)$ to conclude $g(a) - f(a) \leq g(x) - f(x)$.

(e) Apply (d) to $f(x)$ and $Mx$ to get the right inequality and to $mx$ and $f(x)$ to
get the left inequality. ∎


Theorem 1c could be called the Strictly Increasing Function Theorem (SIFT).
Lax-Burstein-Lax [9] calls it the Criterion for Monotonicity. There the IFT is
derived directly from the SIFT by looking at $f(x) + mx = g(x)$, for all positive
slopes $m$. If $f'(x) \geq 0$, then $g'(x) > 0$, so by the SIFT $g$ is strictly increasing. Thus
if $x > a$, then $f(a) + ma < f(x) + mx$. Since this inequality holds for all $m > 0$, it
follows that $f(a) \leq f(x)$, that is, $f$ is increasing. I feel, however, that this proof is a
little tricky. Although the idea of perturbing a function is important throughout
analysis, it comes out of the blue for a first-year calculus student. I prefer the IFT
over the SIFT as a theoretical cornerstone. First, our proof that the IFT implies
the SIFT is easier and more natural than a proof that the SIFT implies the IFT.
More importantly, Theorem 1b, which could be called the Constant Function
Theorem, follows immediately from the IFT; the only way the SIFT can get this
fundamental result is via the IFT. By the way, I view the Constant Function
Theorem as even more basic than the IFT. It would be nice to use it as our
theoretical cornerstone, but I know of no way to use it to get the IFT.

Theorem 1d is called the Racetrack Principle by Jerry Uhl: if one car goes faster
than another, it travels farther during any time interval. It is used as a theoretical
cornerstone in the text [5].

Theorem 1e is perhaps the most important, especially from a historical viewpoint. If the inequalities are rewritten:
$$m \leq \frac{f(x) - f(a)}{x - a} \leq M$$
we have the Mean Value Inequality. The Mean Value Theorem follows immediately if we know that $f'$ is continuous and that the Intermediate Value Theorem
holds. That is exactly what Cauchy did [7]: he proved the Mean Value Inequality 
and assumed the continuity of f’ and the Intermediate Value Theorem. His 
assumption of continuity should not be surprising since his proof of the Mean 
Value Inequality also assumes that the difference quotient (f(x + h) — f(x))/h 
approaches f’(x) uniformly as h approaches 0. Peter Lax has argued that, for the 
theoretical foundations of an introductory calculus course, one should always avoid 
pathology and assume uniform continuity and uniform convergence, just as Cauchy 
did. It is interesting to note that before Cauchy, Ampère [7] saw the importance of
the Mean Value Inequality and even used it as the defining property of the 
derivative. One could argue in a similar vein that the Mean Value Theorem should 
be the defining property of the derivative; Andrew Gleason has told me that a 
calculus textbook by Donald Richmond around 1960 did exactly that, but I have 
been unable to find the book. 

Finally, I should comment on the hypothesis of differentiability at the end- 
points, both in the IFT and in Theorem 1. All one need assume is continuity at the 
endpoints, just as in the MVT. Simply observe in the proof of the IFT that the
initial points $a_1$ and $b_1$ can be chosen so that $a < a_1 < b_1 < b$, since if $f(a) > f(b)$
then by continuity $f(a_1) > f(b)$ for $a_1 > a$ near enough $a$, and $f(a_1) > f(b_1)$ for
$b_1 < b$ near enough $b$.


4. ERROR BOUNDS AND ERROR BEHAVIOR FOR TAYLOR POLYNOMIALS. 
If Theorem 1e is rewritten
$$f(a) + m(x - a) \leq f(x) \leq f(a) + M(x - a),$$
we see a glimmering of an error bound for Taylor polynomials. The proof we are
about to give is almost too transparent and simple to believe: just antidifferentiate
repeatedly the inequality $f^{(n+1)}(x) \leq M$. Not only does the proof give the Lagrange form of the error bound, it also creates the Taylor polynomial itself.
Moreover, as we have observed, it is Lagrange’s original proof and can be found in 
LBL [9]. It is also the proof I wrote for the textbook of the Calculus Consortium 
Based at Harvard [8]. On the other hand, I have so far been unable to find it 
anywhere else. All the other proofs I know involve applications of Rolle’s Theorem 
to rather elaborate auxiliary functions or repeated integration by parts or clever 
tricks with varying parameters. None are natural and none are likely to be 
discovered or appreciated by an average calculus student. 


Theorem 2. (Taylor Error Bound). Suppose that $m \leq f^{(n+1)}(x) \leq M$ on the interval
$[a, b]$, where $f^{(i)}$ denotes the $i$th derivative of $f$. Then on $[a, b]$
$$m \frac{(x - a)^{n+1}}{(n+1)!} \leq f(x) - T_n(x) \leq M \frac{(x - a)^{n+1}}{(n+1)!},$$
where $T_n(x)$ is the degree $n$ Taylor polynomial for $f$ centered at $x = a$.

Proof: To get the upper bound, we apply Theorem 1d (the Racetrack Principle) to
$f^{(n)}(x)$ and $Mx$ (since $f^{(n+1)} \leq M$), which gives
$$f^{(n)}(x) - f^{(n)}(a) \leq M(x - a).$$

Applying the Racetrack Principle again, we get
$$f^{(n-1)}(x) - f^{(n-1)}(a) - f^{(n)}(a)(x - a) \leq M \frac{(x - a)^2}{2},$$
and again
$$f^{(n-2)}(x) - f^{(n-2)}(a) - f^{(n-1)}(a)(x - a) - f^{(n)}(a) \frac{(x - a)^2}{2} \leq M \frac{(x - a)^3}{3!}.$$
Applying the Racetrack Principle a total of $n + 1$ times gives the upper bound.


The lower bound is obtained the same way. ∎


Theorem 2 gives error bounds only for $x \geq a$. To get similar bounds for $x < a$,
we observe that if $f$ is increasing and $x < a$, then $f(x) \leq f(a)$, rather than
$f(a) \leq f(x)$. Thus for $x < a$, each application of Theorem 1d reverses the inequalities, but since Theorem 2 sandwiches the error for $x \geq a$, reversing inequalities
will simply sandwich the error again for $x < a$ (although which bound is the upper
one depends on whether $n$ is odd or even). The usual two-sided error bound
involving absolute values then follows immediately.
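The sandwich in Theorem 2 is easy to test on a familiar function. The sketch below takes f(x) = e^x with a = 0 on the interval [0, 1], where every derivative lies between m = 1 and M = e; the function names and the sample points are choices made for this illustration, and only small n and moderate x are used so that rounding error stays well below the gap between the bounds.

```python
import math

def taylor_exp(x, n):
    """Degree-n Taylor polynomial of exp centered at 0."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def bounds_hold(n, x, m=1.0, M=math.e):
    """Check m*x^(n+1)/(n+1)! <= exp(x) - T_n(x) <= M*x^(n+1)/(n+1)! on [0, 1]."""
    error = math.exp(x) - taylor_exp(x, n)
    scale = x ** (n + 1) / math.factorial(n + 1)
    return m * scale <= error <= M * scale

# Every (n, x) pair sampled here should satisfy the two-sided bound.
results = {(n, x): bounds_hold(n, x)
           for n in range(5)
           for x in (0.25, 0.5, 0.75, 1.0)}
```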

It is possible for students to discover Theorem 2 for themselves. Consider the 
following problem. A particle is traveling along the x-axis with position x = f(t) 
and suppose the initial position, velocity, and acceleration are all 0. If $f'''(t) \leq 5$
for $t \geq 0$, find an upper bound on the position at time $t = 2$. Since students are
well-trained to antidifferentiate acceleration to get velocity and velocity to get
position, it is not unnatural to see them argue as follows:

$$\begin{aligned}
f'''(t) &\leq 5 \\
a = f''(t) &\leq 5t + c_1, \quad \text{and here } c_1 = 0 \text{ since } f''(0) = 0 \\
v = f'(t) &\leq \frac{5t^2}{2} + c_2, \quad \text{and here } c_2 = 0 \text{ since } f'(0) = 0 \\
s = f(t) &\leq \frac{5t^3}{6} + c_3, \quad \text{and here } c_3 = 0 \text{ since } f(0) = 0.
\end{aligned}$$

Thus, we get $f(2) \leq 5 \cdot 2^3/6 = 20/3$. This is a legitimate argument as long as one
can justify antidifferentiating inequalities in the same way as equalities. That is
exactly the point of the Racetrack Principle!
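The students' computation amounts to antidifferentiating the constant bound three times with zero constants of integration. A minimal sketch with exact rational arithmetic follows; the coefficient-list representation and the function names are conveniences introduced here.

```python
from fractions import Fraction

def antidifferentiate(coeffs):
    """Antiderivative with zero constant term of c0 + c1*t + c2*t^2 + ..."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

def evaluate(coeffs, t):
    """Value of the polynomial with the given coefficients at t."""
    return sum(c * Fraction(t) ** k for k, c in enumerate(coeffs))

bound = [Fraction(5)]        # f'''(t) <= 5
for _ in range(3):           # bounds for acceleration, velocity, position
    bound = antidifferentiate(bound)

position_bound_at_2 = evaluate(bound, 2)   # the bound 5 * 2**3 / 6
```

The final coefficient list represents 5t³/6, and its value at t = 2 is the fraction 20/3 found above.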

Acceleration and velocity are not a bad way of introducing Taylor series. The 
usual formula students memorize from physics, 


$$s = s_0 + v_0 t + \tfrac{1}{2} a t^2,$$


is precisely the degree 2 Taylor polynomial for s(t) when the constant acceleration 
a is interpreted as the acceleration at time 0. This fact seems worth exploiting, but 
I don’t know any textbook that makes the connection. 

Taylor’s theorem is usually presented as a method of bounding the error in
approximating a function by its degree $n$ Taylor polynomial. This viewpoint is
particularly appropriate in studying the error for fixed $x$ as $n \to \infty$, as in the proof
of the convergence for all values of $x$ for the Taylor series for $e^x$ or $\sin x$.
Nevertheless, I believe that this viewpoint is overemphasized and that the true
power of Taylor series is in explaining error (or convergence) behavior for fixed $n$
as $x \to a$. Why is Simpson’s Rule so much better than the Trapezoid Rule? What
makes the approximation $\sin x \approx x$ so good? For numerical behavior, the important thing to know is the order of convergence for fixed $n$ under normal circumstances and what situations might affect that order of convergence. The real point
of Taylor’s theorem is that the error is order $n + 1$ in $(x - a)$ with a constant
depending on the $(n + 1)$st derivative.

To be more precise, we say $E(h)$ is asymptotic to $Ch^n$, denoted $E(h) \sim Ch^n$, if
$\lim_{h \to 0} E(h)/h^n = C$. Also, we say $E(h)$ is order $n$ with bound $M$ if
$\limsup_{h \to 0} |E(h)/h^n| \leq M$. Then Taylor’s theorem can be viewed this way:


Corollary. Let $E(h)$ be the error $f(x) - T_n(x)$ where $T_n(x)$ is the $n$th degree Taylor
polynomial for $f$ at $x = a$ and where $h = x - a$. If $f^{(n+1)}$ is continuous at $x = a$, then
$E(h) \sim f^{(n+1)}(a) h^{n+1}/(n + 1)!$. If $|f^{(n+1)}(x)| \leq M$ in a neighborhood of $x = a$,
then $E(h)$ is order $n + 1$ with bound $M/(n + 1)!$.
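As an illustration of the Corollary, take f = sin and a = 0 with n = 2, so that T_2(x) = x and the third derivative at 0 is −1; the Corollary then predicts E(h)/h³ → −1/6. The sketch below tabulates that ratio for a few step sizes; the function name and the choice of step sizes are assumptions of this example.

```python
import math

def error_ratio(h, n=2):
    """E(h) / h^(n+1) for f = sin at a = 0, where T_2(x) = x."""
    return (math.sin(h) - h) / h ** (n + 1)

predicted = -1 / 6    # f'''(0) / 3! with f''' = -cos

# Ratios for shrinking h; they should approach the predicted constant.
ratios = {h: error_ratio(h) for h in (0.5, 0.1, 0.01)}
gaps = {h: abs(r - predicted) for h, r in ratios.items()}
```

The gap between the ratio and −1/6 shrinks as h decreases, which is the behavior that makes sin x ≈ x such a good approximation near 0.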


5. ERROR BEHAVIOR FOR NUMERICAL INTEGRATION. Another application 
of the Mean Value Theorem is to explain the error behavior for various common 
numerical integration rules: Left Rule, Right Rule, Trapezoid Rule, Midpoint 
Rule, Simpson’s Rule. This behavior is best described using Taylor series in $\Delta x$ for
the error. Numerical analysis texts sometimes do this, but calculus texts don’t. 
Since this approach is not so well-known, I'll give a version. 

The idea is to concentrate on one panel of the subdivided area. Without loss of
generality, we can assume the panel is centered at the origin. Thus we wish to
compute

$$I(h) = \int_{-h}^{h} f(x)\,dx, \quad \text{where } h = \Delta x/2.$$

The estimate for this single panel by the left-rectangle rule is

$$I(h) \approx L(h) = 2h\,f(-h).$$

The other estimates are given by 
Left: L(h) = 2hf(—h) 
Right: R(h) = 2hf(h) 
Midpoint: M(h) = 2hf(0) 
Trapezoid: T(h) = (L(h) + R(h))/2 
Simpson: S(h) = (2M(h) + T(h))/3 
The formula relating Simpson’s Rule to the midpoint and trapezoidal rules is 
not as well known as it should be. Students can be led to guess the weighted mean
as a better estimate, if they spend a little time looking at the error behavior of the 
midpoint and trapezoidal rules. 
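
The five one-panel estimates can be compared directly. The following is a minimal Python sketch under stated assumptions: the function name `panel_rules` and the test integrand e^x are illustrative choices, not taken from the article. It builds Simpson's estimate as the weighted mean (2M + T)/3.

```python
import math

def panel_rules(f, h):
    """Single-panel estimates of the integral of f over [-h, h]."""
    left = 2 * h * f(-h)
    right = 2 * h * f(h)
    mid = 2 * h * f(0.0)
    trap = (left + right) / 2
    simpson = (2 * mid + trap) / 3  # weighted mean of midpoint and trapezoid
    return {"left": left, "right": right, "midpoint": mid,
            "trapezoid": trap, "simpson": simpson}

if __name__ == "__main__":
    h = 0.1
    exact = math.exp(h) - math.exp(-h)  # integral of e^x over [-h, h]
    for name, value in panel_rules(math.exp, h).items():
        print(f"{name:10s} error = {value - exact: .3e}")
```

In this example the midpoint and trapezoid errors have opposite signs, with the trapezoid error roughly twice as large, which is what suggests weighting the midpoint estimate twice as heavily.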
We want to compute the Taylor series centered at a = 0 for all these functions.
For the rules, this is simply a matter of replacing f(h) or f(−h) by the Taylor
series for f centered at a = 0. For I(h), we observe that by the Fundamental
Theorem of Calculus, $I'(h) = f(h) + f(-h)$. Thus $I''(h) = f'(h) - f'(-h)$,
$I'''(h) = f''(h) + f''(-h)$, etc.

The Taylor series for I(h) is therefore

$$I(h) = 2f(0)\,h + 2f''(0)\,\frac{h^3}{3!} + 2f^{(4)}(0)\,\frac{h^5}{5!} + \cdots$$


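The series for the rules are

$$L(h) = 2h\left[f(0) + f'(0)(-h) + f''(0)\frac{(-h)^2}{2!} + \cdots\right]$$

$$R(h) = 2h\left[f(0) + f'(0)h + f''(0)\frac{h^2}{2!} + \cdots\right]$$

$$M(h) = 2h\,[f(0)]$$

$$T(h) = 2h\left[f(0) + f''(0)\frac{h^2}{2!} + f^{(4)}(0)\frac{h^4}{4!} + \cdots\right]$$

$$S(h) = 2h\left[f(0) + f''(0)\frac{h^2}{3!} + f^{(4)}(0)\frac{h^4}{3 \cdot 4!} + \cdots\right]$$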


The error behavior for each rule is obtained by subtracting the Taylor series for 
I(h) from the Taylor series for the rule and looking for the first term that doesn’t 
cancel. The errors behave asymptotically as follows: 


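$$\text{Left Error} \sim -2f'(0)\,h^2$$

$$\text{Right Error} \sim 2f'(0)\,h^2$$

$$\text{Midpoint Error} \sim -2f''(0)\,\frac{h^3}{3!}$$

$$\text{Trapezoid Error} \sim 2f''(0)\left(\frac{1}{2!} - \frac{1}{3!}\right)h^3 = 2f''(0)\,\frac{h^3}{3}$$

$$\text{Simpson Error} \sim 2f^{(4)}(0)\left(\frac{1}{3 \cdot 4!} - \frac{1}{5!}\right)h^5 = 2f^{(4)}(0)\,\frac{h^5}{180}$$
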
The error behavior of these rules for the entire interval is obtained by multiplying
by the number n of subdivisions and replacing h by $\Delta x/2$ where $\Delta x = (b - a)/n$,
except for Simpson’s rule where $h = \Delta x$ (so that each panel spans two subdivisions
and there are n/2 panels). We have to replace $f^{(k)}(0)$ by a bound
$M_k$ on $|f^{(k)}|$ for the entire interval. Using 2nh = (b − a), we find the absolute
value of the errors have the following behavior in terms of $\Delta x$:


Left: order 1 with bound $(b - a)(1/2)M_1$
Right: order 1 with bound $(b - a)(1/2)M_1$
Midpoint: order 2 with bound $(b - a)(1/24)M_2$
Trapezoid: order 2 with bound $(b - a)(1/12)M_2$
Simpson: order 4 with bound $(b - a)(1/180)M_4$.
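
These orders can be observed by halving Δx and comparing errors. The following is a minimal Python sketch; the helper names (`composite`, `observed_order`) and the test integral of e^x on [0, 1] are assumptions made for illustration. Here Simpson's estimate on each subinterval uses that subinterval's midpoint, so its panel half-width is Δx/2 rather than the Δx convention above; this changes the constant but not the order.

```python
import math

def left(f, u, v):
    return (v - u) * f(u)

def midpoint(f, u, v):
    return (v - u) * f((u + v) / 2)

def trapezoid(f, u, v):
    return (v - u) * (f(u) + f(v)) / 2

def simpson(f, u, v):
    # Weighted mean of the midpoint and trapezoid estimates on [u, v].
    return (2 * midpoint(f, u, v) + trapezoid(f, u, v)) / 3

def composite(rule, f, a, b, n):
    """Apply a one-panel rule to each of n equal subintervals of [a, b]."""
    dx = (b - a) / n
    return sum(rule(f, a + i * dx, a + (i + 1) * dx) for i in range(n))

def observed_order(rule, f, a, b, exact, n):
    """Estimate p in error ~ C * dx^p from the errors at n and 2n subintervals."""
    e1 = abs(composite(rule, f, a, b, n) - exact)
    e2 = abs(composite(rule, f, a, b, 2 * n) - exact)
    return math.log2(e1 / e2)

if __name__ == "__main__":
    exact = math.e - 1  # integral of e^x over [0, 1]
    for rule in (left, midpoint, trapezoid, simpson):
        print(rule.__name__, observed_order(rule, math.exp, 0.0, 1.0, exact, 16))
```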


The typical textbook problem on numerical integration is to find the value of n 
that guarantees the error is within a specified tolerance. In practice, one simply 
keeps doubling n until the desired number of digits seems to have stabilized. Thus, 
error behavior, rather than error bounds, may be what we really are interested in.
For example, it is useful to know that increasing n by a factor of 10 for the Left or
Right Rule reduces the error by a factor of 10, that is, it gives one more
significant digit. Thus if it takes 1 second for a graphing calculator to compute an
integral accurate to 2 digits using the Left or Right Rule, it will take $10^{10}$ seconds
to get 12 digits of accuracy (that’s 316.9 years and, as my students have observed, a
lot of batteries). By contrast, Simpson’s Rule gets 4 extra digits for 10 times the
work, and the same integral can be computed to 12 digits of accuracy in a minute
or two on the same calculator (Simpson’s Rule probably would get a headstart of 4
or 5 digits in the first second).
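
The arithmetic behind these time estimates can be reproduced in a few lines. This is a minimal Python sketch; the function name `seconds_needed` is hypothetical, and the year length of 365.25 days is an assumption.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length (365.25 days)

def seconds_needed(start_digits, target_digits, digits_per_tenfold,
                   start_seconds=1.0):
    """Time to reach target_digits if each tenfold increase in work
    adds digits_per_tenfold correct digits."""
    tenfolds = (target_digits - start_digits) / digits_per_tenfold
    return start_seconds * 10 ** tenfolds

if __name__ == "__main__":
    # Left or Right Rule: 2 digits after 1 second, one digit per tenfold work.
    print(seconds_needed(2, 12, 1) / SECONDS_PER_YEAR, "years")
    # Simpson's Rule: assume 4 digits after 1 second, four digits per tenfold.
    print(seconds_needed(4, 12, 4), "seconds")
```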

The dependence of error behavior on the higher derivatives of the integrand is 
also important, because it is a warning to look out for integrals whose integrand 
has an unbounded derivative on the interval of integration. For example, even 
using Simpson’s Rule on $\int_0^1 \sqrt{1 - x^2}\,dx$ to get an approximation for $\pi/4$ is painfully
slow going. Indeed, the order of convergence is 3/2 rather than 4.
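
The reduced order can be seen by doubling n and comparing errors. The following is a minimal Python sketch; the helper names and the choice n = 64 are assumptions made for illustration. It uses the standard composite form of Simpson's Rule.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's Rule with n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    dx = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * dx)
    return total * dx / 3

def quarter_circle(x):
    # max(...) guards against 1 - x*x dipping below zero from rounding.
    return math.sqrt(max(0.0, 1.0 - x * x))

def observed_order(n):
    """Estimate the order of convergence from the errors at n and 2n."""
    e1 = abs(simpson(quarter_circle, 0.0, 1.0, n) - math.pi / 4)
    e2 = abs(simpson(quarter_circle, 0.0, 1.0, 2 * n) - math.pi / 4)
    return math.log2(e1 / e2)

if __name__ == "__main__":
    print(observed_order(64))
```

The unbounded derivative at x = 1 is what spoils the usual fourth-order behavior: the same routine applied to a smooth integrand shows order close to 4.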

Taylor series can be used in the same way to analyze the error behavior for 
numerical differentiation approximations: 


$$f'(x) \approx \frac{f(x+h) - f(x)}{h},$$

$$f'(x) \approx \frac{f(x+h) - f(x-h)}{2h},$$

$$f''(x) \approx \frac{f(x+h) + f(x-h) - 2f(x)}{h^2}.$$

For example, students are often curious why some graphing calculators use the 
second of these approximations as a numerical derivative rather than the more 
familiar first approximation. Taylor series give the answer immediately: the error
for the second approximation is order 2 while that for the first is order 1. The dependence of
the error of each approximation on higher derivatives of f also has interesting 
effects. Try plotting the error near x = 0 with h = .01 for the second approxima- 
tion to f’, when f is the innocuous-looking function f(x) = x°”. 
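
The difference in order between the first two approximations is easy to observe. The following is a minimal Python sketch; the function names and the smooth test function sin at x = 1 are assumptions chosen for illustration, not the example suggested above.

```python
import math

def forward_diff(f, x, h):
    """First approximation to f'(x): one-sided difference quotient."""
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    """Second approximation to f'(x): symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_diff(f, x, h):
    """Approximation to f''(x)."""
    return (f(x + h) + f(x - h) - 2 * f(x)) / h ** 2

if __name__ == "__main__":
    exact = math.cos(1.0)  # derivative of sin at x = 1
    for h in (0.1, 0.01, 0.001):
        print(h,
              abs(forward_diff(math.sin, 1.0, h) - exact),
              abs(central_diff(math.sin, 1.0, h) - exact))
```

Shrinking h by a factor of 10 cuts the first error by about 10 and the second by about 100, matching orders 1 and 2.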


6. THE FUNDAMENTAL THEOREMS OF CALCULUS. The proof given in [8] 
for the Taylor error bound appeals to the Fundamental Theorem of Calculus to 
turn the inequality $f^{(n+1)}(x) \le M$ into the inequality $f^{(n)}(x) - f^{(n)}(a) \le M(x - a)$.
I suspect this is the natural inclination of most mathematicians, and it shows how
underappreciated the IFT is. No definite integrals are needed; the IFT
itself is a disguised form of integration. The subtle connection between the IFT 
and the Fundamental Theorem of Calculus is worth discussing. 

There are of course two main versions of the Fundamental Theorem of 
Calculus. There are also variations on what restrictions are placed on the inte- 
grand f. I will assume f is continuous. The theorems then are 


First Fundamental Theorem of Calculus (FTC I). If f is continuous on the interval
[a, b] and $F(x) = \int_a^x f(t)\,dt$ for x in [a, b], then F′(x) = f(x).


Second Fundamental Theorem of Calculus (FTC II). If f is continuous and F′(x) =
f(x), then $\int_a^b f(t)\,dt = F(b) - F(a)$.


The First Fundamental Theorem is not directly related to the IFT. The hard 
part of the proof is showing that continuous functions are Riemann integrable. The
rest is a straightforward consequence of the integral version of the Mean Value
Inequality:

$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a),$$

where $m \le f(x) \le M$ on the interval [a, b]. Note that unlike the Mean Value
Inequality for derivatives, this inequality follows easily from the definition of the
Riemann integral, so easily that it is not uncommon to view the inequality as a
defining property of the definite integral (the corresponding view for the Mean
Value Inequality for derivatives, Ampère notwithstanding, is much less common).

On the other hand, the Second Fundamental Theorem is closely connected to 
the IFT. The IFT for continuously differentiable functions follows directly from 
the FTC II and the fact that the integral of a nonnegative function is nonnegative. 
In fact, that is the way the IFT is proved in [8]. There, the FTC II, as embodied in 
the relation between velocity and change in position, is taken as the intuitively 
clear, theoretical cornerstone, and the IFT is derived from it. I suspect, however, 
that most students see the IFT as more “obvious” than the FTC II. 

Conversely, the IFT implies the FTC II by the method used in many calculus 
books: simply invoke the FTC I with x = b and observe that, by the IFT (the 
Constant Function Theorem, Theorem 1b), two antiderivatives of f differ by a 
constant. 

The assumption of continuity in the FTC I is necessary. The assumption of 
continuity in the FTC II is another matter. Of course, if F’ = f is not continuous, 
the integral might not exist. For example, if F(0) = 0 and $F(x) = x^2 \sin(1/x^2)$ for
$x \ne 0$, then F′ exists everywhere but is not even Lebesgue integrable on [0, 1].
Suppose, however, that we assume only that $\int_a^b f(t)\,dt$ exists. Then the familiar
argument using the Mean Value Theorem still works. Just represent F(b) − F(a)
as a telescoping sum and use the MVT on each term of the sum to turn it into a
Riemann sum for $\int_a^b f(t)\,dt$. Here the IFT does not work. Just as the MVT follows
from the IFT only under the assumption of continuity of the derivative, the FTC II 
follows from the IFT only under the assumption of continuity of the integrand. 
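
The telescoping argument can be written out as follows. This is a sketch in which the partition $a = x_0 < x_1 < \cdots < x_n = b$ and the intermediate points $c_i$ are notation introduced here for illustration:

```latex
F(b) - F(a)
  = \sum_{i=1}^{n} \bigl( F(x_i) - F(x_{i-1}) \bigr)
  = \sum_{i=1}^{n} F'(c_i)\,(x_i - x_{i-1})
  = \sum_{i=1}^{n} f(c_i)\,(x_i - x_{i-1}),
  \qquad x_{i-1} < c_i < x_i .
```

The right-hand side is a Riemann sum for f on [a, b], while the left-hand side does not depend on the partition. If the integral exists, refining the partition forces the Riemann sums to converge to it, so the integral equals F(b) − F(a). The MVT is what supplies the exact points $c_i$; an inequality such as the IFT gives no such points.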


7. CONCLUSION. Many calculus textbooks have sections where the author is 
writing on automatic pilot, just putting in material demanded by users. These 
sections have the same dreary examples; little is new, or thought over fresh from 
the start. This shouldn’t be surprising, since writing a calculus textbook is a 
significant project and one can’t devote the same enthusiasm and energy to all 
parts of the project. I have always felt that the theoretical sections of standard 
calculus textbooks are most prone to such a pedestrian treatment. Moreover, 
calculus instruction does not place much emphasis on those theoretical sections, at 
least when it comes to testing. For example, a study of the compendium of final 
exams in [11] reveals only one question (out of more than 300 on 23 exams) 
involving the Mean Value Theorem, and that one asked for the value of c 
satisfying the conclusion of the Mean Value Theorem for a quadratic function. 
When both textbooks and instruction appear to be just going through the motions 
with theory, it surprises me that some critics of new textbooks like [8] bemoan the 
absence of the Mean Value Theorem or an $\varepsilon$-$\delta$ definition of limit.

I sympathize with yearnings for an occasional foray into the theoretical struc- 
ture of the calculus. I just ask that it be thoughtful and sensible. Use intuitive 
definitions. If a theorem is to be used without proof, like the Mean Value 
Theorem, keep it as simple and as “obvious” as possible. Don’t use tricky proofs or
deus-ex-machina auxiliary functions. Don’t prove things in more generality than
necessary; even analysts don’t usually deal with the discontinuous derivatives 
allowed by the Mean Value Theorem. 

In this paper, I have tried to give a sensible approach to the Mean Value 
Theorem and its usual applications to monotonicity, Taylor error bounds, quadra- 
ture error bounds, and the Fundamental Theorems of Calculus. One standard 
application of the MVT I have not considered is l’Hôpital’s Rule; for a non-MVT
approach, see [3]. LBL [9] has some other applications to concavity and the second 
derivative test for extrema. 

In recent years, calculus content and pedagogy have been rethought completely. 
People have found that there is nothing sacred about related rates and the lecture 
method. It is time as well to rethink the theory taught in standard calculus classes. 
There is nothing sacred about the Mean Value Theorem. 


ACKNOWLEDGMENTS. I wish to thank Andy Gleason, Peter Lax, and Jerry Uhl for numerous 
suggestions and corrections for this article. In particular, the proof given for the IFT was instigated by a 
bisection proof Lax showed me for the SIFT. He also showed me applications to l’Hôpital’s Rule, the
Corrected Midpoint Rule for quadrature, and the definition of volumes and arclengths using antideriva- 
tives rather than definite integrals; all of this I hope he puts into print. I am indebted to Gleason, 
whose meticulous reading caught a number of egregious errors and whose comments cleared up my 
muddy thinking at numerous points. 


REFERENCES 


1. L. Bers, On avoiding the mean value theorem, Amer. Math. Monthly 74 (1967), 583. 
2. D. Bo, A simple derivation of the Maclaurin series for sine and cosine, Amer. Math. Monthly 97 
(1990), 836. Editor’s Note in the Monthly 98 (1991), 364. 
3. R. P. Boas, Lhospital’s rule without mean value theorems, Amer. Math. Monthly 76 (1969), 
1051-1053. 
4. R. P. Boas, Who needs these mean-value theorems anyway?, Two-Year College Math J. 12 (1981),
178-181. 
5. W. Davis, H. Porta, J. Uhl, Calculus & Mathematica: Derivatives. Measuring Growth, Addison- 
Wesley, 1994. 
6. J. Dieudonné, Foundations of Modern Analysis, Academic Press, New York, 1960.
7. J. V. Grabiner, The Origins of Cauchy’s Rigorous Calculus, MIT Press, Cambridge, 1981.
8. D. Hughes-Hallett, A. M. Gleason, et al., Calculus, John Wiley & Sons, New York, 1994. 
9. P. Lax, S. Burstein, and A. Lax, Calculus with Applications and Computing, Volume 1, Springer- 
Verlag, New York, 1984. 
10. D. E. Richmond, An elementary proof of a theorem of calculus, Amer. Math. Monthly 92 (1985),
589-590. 
11. L. A. Steen, editor, Calculus for a New Century: A Pump, Not a Filter, MAA Notes 8, 
Mathematical Association of America, Washington, DC, 1988. 


Department of Mathematics 
Colgate University 
Hamilton, NY 13346 
ttucker@center.colgate.edu 


240 THE ROLE OF THE MEAN VALUE THEOREM [March 


Commentary on Rethinking Rigor 
in Calculus: The Role of the Mean 
Value Theorem 


Howard Swann 


Professor Tucker’s article joins the current deconstructive attack on traditional 
content and methods of teaching of calculus that seems to be part of the mission of 
the militant wing of the ‘Calculus Reform Movement.’ Here the primary targets 
are current textbooks’ efforts to present the foundations of calculus and the 
frequent use of the mean value theorem. 

As the author remarks, the traditional presentation of the foundations of 
calculus is often poorly motivated and incomprehensible to most students. So in 
reforming the teaching of the calculus sequence, one should either omit the logical 
foundations or attempt to make them interesting and comprehensible. The author, 
who is one of the co-authors of the ‘Harvard Calculus’ text [2] where the first 
option is chosen and the concept of mathematical proof based on rigorous 
definitions is eliminated entirely, urges that we keep things as “intuitive ..., simple 
and obvious as possible.” Various demonstrations are our new “proofs;” I use the 
quotation marks to make the distinction. The author’s favored replacement for the 
Mean Value Theorem (MVT), the Increasing Function Theorem (IFT), finds its 
intuitive justification in an automotive (‘Racetrack’) argument. Such automotive 
arguments are a new addition to our pantheon of “proofs.” An automotive “proof” 
of the IFT is ‘if the speedometer on a motor-car always reports a number greater 
than or equal to zero, then the car must be moving (weakly) forward.’ The IFT is to 
be treated as an ‘axiom,’ yet the essential first foundational question for calculus is 
‘What is it that a speedometer is supposed to report?’ Intuition falters here, for
nature has yet to provide us with a speedometer. 

The author states, “The origin of The Mean Value Theorem in the structure of 
the real numbers is not addressed; that is much too difficult for a standard course.” 
I agree that proofs of the extreme value theorem and other global results from 
basic principles do not belong in today’s beginning calculus texts in the present 
educational climate. However, an informal discussion of human attempts to define 
‘number’ is fascinating and accessible. 

For example, although we currently use the ‘real numbers,’ today’s students, 
brought up on Star Trek, are delighted with the realization that there still is the 
following problem: When we use real numbers to represent time t and position
p(t), we are led to the conclusion that in moving from p(t′) to p(t″) we must
disappear infinitely often, for there is NO instant of time ‘next to’ t′ nor any
position ‘next to’ p(t′). A variant of this problem bothered Zeno 2500 years ago; it
has not been resolved; the reals are indeed ‘full of holes.’ Why shouldn’t today’s 
students again contemplate this version of the abyss confronting human attempts 
to comprehend infinity, particularly when Weierstrass has contrived a remarkably 
clever way across? 

For Weierstrass, in treating continuity and differentiability, insisted that we
consider only functions that are accompanied by suitable $\varepsilon$, $\delta$ arguments. In this
class of functions he was able to show uniqueness (and thus define ‘correctness’) of
guesses for limits and derivatives [1]. These notions are accessible to students and 
give the foundations for differential calculus. When we add the global results, the 
implications so astonished Bertrand Russell that he pronounced [4, p. 64]: 


...all goes smoothly until we reach those studies in which the notion of 
infinity is employed—the infinitesimal calculus and the whole of higher 
mathematics. The solution of the difficulties which formerly surrounded the 
mathematically infinite is probably the greatest achievement of which our age 
has to boast. 


Learning to understand and appreciate proofs is a gradual process; it surely is 
imperative to introduce the notion of mathematical proof in beginning mul- 
tisemester calculus and keep it alive even though actual proofs are few. Such an 
introduction is essential for later mathematics courses, and students must be made 
aware that the assertions of mathematics can be proved to be true. 

The bright promise of the new technology gives us a chance to explore these 
ideas in a striking way. For example, using—say— Mathematica, ‘zoom’ the graphs 
of f(x) = |x| sin(1/x) (continuously extended) and $g(x) = x^{2n/(2n-1)} \sin(1/x)$
(continuously extended), search for possible ‘local linearity’ and try to decide if 
they are differentiable at zero. For n > 4, the graph of [g(x) — g(0)]/x has a 
delightful fractal quality when you magnify the domain around zero; we give 
examples in Figure 1. These are not ‘important’ functions in a practical sense, but 
a look at such graphs encourages a sense of delight and wonder concerning the 
difficulties of the foundations of analysis. It does not take very much time to 
present and discuss these ideas. 
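
The same experiment can be run without a plotting package by sampling difference quotients at zero on a shrinking window. The following is a minimal Python sketch under stated assumptions: the exponent 2n/(2n − 1) for g is read from a damaged formula and is implemented as a power of |x|, and the helper `max_quotient` is hypothetical.

```python
import math

def f(x):
    """|x| sin(1/x), extended continuously by f(0) = 0."""
    return 0.0 if x == 0 else abs(x) * math.sin(1.0 / x)

def g(x, n=4):
    """|x|^(2n/(2n-1)) sin(1/x), extended by g(0) = 0 (assumed exponent)."""
    return 0.0 if x == 0 else abs(x) ** (2 * n / (2 * n - 1)) * math.sin(1.0 / x)

def max_quotient(func, width, samples=2000):
    """Largest |(func(x) - func(0)) / x| over sample points in (0, width]."""
    best = 0.0
    for k in range(1, samples + 1):
        x = width * k / samples
        best = max(best, abs((func(x) - func(0.0)) / x))
    return best

if __name__ == "__main__":
    for width in (1e-1, 1e-3, 1e-6, 1e-9):
        print(width, max_quotient(f, width), max_quotient(g, width))
```

For f the quotient keeps swinging between values near −1 and 1 however small the window, so no limit exists. For g it is squeezed toward zero, but only at the slow rate width^(1/(2n−1)).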


Figure 1. Graphs of [g(x) − g(0)]/(x − 0) on small intervals around zero.

As for the mean value theorem, the author states “And by the way, there is
nothing obvious about the MVT without the hypothesis of continuity of the 
derivative.” I believe that this is not true, for here is a pictorial “proof” of the 
MVT: 


MEAN VALUE THEOREM. If f(x) is continuous on [a, b] and has a derivative on
(a, b), then there is some point c, a < c < b, such that

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$


Pictorial “proof”: Intuitively, the assumptions of the theorem mean that the graph
of f(x) is smooth between (a, f(a)) and (b, f(b)) and presumably has no sharp 
corners since f has derivatives. If the graph of f(x) is not a straight line, some of 
the graph will be above the line through (a, f(a)) and (b, f(b)) or below this line. 
Suppose some of the graph is above the line. Imagine a line that is parallel to the 
line through (a, f(a)) and (b, f(b)) but far above the line. Move it down toward the 
line, keeping it parallel to the line through (a, f(a)) and (b, f(b)). Since there are 
no corners on the graph, when the line first hits the graph at some point (c, f(c)), 
surely it will be tangent to the graph at such a point. So, if our definition of the 
derivative as the slope of a line that is tangent to the graph at (c, f(c)) is any good, 
the slope of this tangent line must be f’(c). But since the line is parallel to the line 
through (a, f(a)) and (b, f(b)), it will have the same slope as this line, i.e. 


$$f'(c) = \frac{f(b) - f(a)}{b - a} = \text{slope of line through } (a, f(a)) \text{ and } (b, f(b)).$$

A similar argument holds if some of the graph is below the line. ∎

Figure 2. The graph of f over [a, b], with the rise f(b) − f(a) marked and the point c indicated between a and b.

It is to be hoped that the phrases ‘presumably’ and ‘if the definition is any good’
make the students suspicious. This should lead to consideration of a special case 
(Rolle’s theorem) where it is clear that the only assumption necessary is the 
extreme value theorem. This in turn invites a discussion (no proofs) of the extreme 
value theorem as one of the crucial global tests of the effectiveness of Weierstrass’ 
definition of continuity. The boundedness theorem, the extreme value theorem, 
and the intermediate value theorem are no longer ‘obvious’ when students have 
realized that the problems with the ‘holes’ in the real numbers extend to 
any intuitive sense of continuity of a function on an interval. For example, is 
Weierstrass’ definition of continuity strong enough to force a continuous function 
to be bounded on a closed bounded interval? The announcement that the answer is 
‘yes’ is an excellent promotional preview of later courses. At any rate, once 
students have ‘bought’ Rolle’s theorem, we can then use the conventional proof to 
show that the MVT must hold. 

Here we reverse the author’s prescription for giving mathematics a bad name; 
such a sequence of arguments reveals the charm and power of mathematics, for we 
prove that a questionable complicated result must be true if we assume other 
simpler results that are less questionable. 

The author offers us a mathematical proof of the Increasing Function Theorem, 
“easier than most proofs of the MVT.” He presents the theorem as


If f′ ≥ 0 on an interval, then f is increasing on that interval.


We infer from the proof that the interval is [a, b], closed and bounded, and that 
we are to have one-sided derivatives at a and b. 

The key observation for the author’s proof is the following: Given f, let
slope(a, b) = [f(b) − f(a)]/(b − a). The author points out that “If slope(a, b) = m
and c is between a and b, then one of slope(a, c) and slope(c, b) is greater than
or equal to m and one is less than or equal to m. For a proof, draw the obvious
picture.” 

The “obvious picture” encourages this assertion, but knowing that the art of 
converting a “proof” to a proof is one of the key skills our majors should learn, if 
we are giving a proof here, we must go further. Two mathematical proofs are 
immediately discovered; a proof by contradiction (four main cases) or a direct 
proof. The direct proof shows first that the result must be true if m = 0, and then 
uses the same ‘deus-ex-machina’ auxiliary function that annoys the author when it 
is employed to prove the MVT from Rolle’s Theorem. 
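
One way to check the quoted assertion, offered as a sketch and not necessarily either of the two proofs just mentioned, is to write slope(a, b) as a weighted average of the two smaller slopes. For a < c < b,

```latex
\mathrm{slope}(a,b)
  = \frac{f(b) - f(a)}{b - a}
  = \frac{c - a}{b - a}\,\mathrm{slope}(a,c)
  + \frac{b - c}{b - a}\,\mathrm{slope}(c,b),
\qquad
\frac{c - a}{b - a} + \frac{b - c}{b - a} = 1 .
```

Both weights are positive, so m = slope(a, b) lies between slope(a, c) and slope(c, b). Hence one of the two is at least m and the other is at most m.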

The author admonishes us: “... previous calls to downplay the MVT have fallen 
on the deaf ears of textbook writers. Maybe calculus reform has unblocked some 
ears and it is time to try the call again.” 

I borrow a phrase from the author and wave Occam’s Razor, that ‘principle of 
parsimony,’ “like a cross in front of a vampire, to hold the” attack on the mean 
value theorem at bay. 

The mean value theorem is actually a friendly theorem; what has it done to 
provoke such ire? It provides one more test of the effectiveness of Weierstrass’ 
definition of limit and continuity and is used endlessly to establish all sorts of 
results; more, by the author’s admission, than the IFT. For example, how do we 
prove that the formula for arc length is correct without the MVT? We do not prove 
it, apparently; with the wave of the symbol ‘~ ,’ we make ourselves content with 
the arguments of “Newton, Leibnitz, the Bernoullis and Euler.” 

Which is more intuitively ‘obvious’ and persuasive; the pictorial “proof” of the 
MVT that we sketched above, or an automotive “proof” of the IFT?

A one-line proof shows that the author’s IFT follows from the MVT. The
author’s suggested proof for the MVT from the IFT requires, in addition to the 
usual assumptions for the MVT, that the function’s derivatives be extendible to a 
continuous function on the closed interval, requires the extreme value theorem 
and the intermediate value theorem, and fails to establish that the sought-for value 
for c is strictly between points a and b. This is essential, for example, for showing 
that we can repeat the application of L’Hospital’s rule a second time in evaluating 
a limit. 

One positive note: The author’s argument (Theorem 2) for an error bound for 
Taylor’s series is elegant and should be adopted by one and all. 

However, I do not find the main arguments of the paper to be persuasive. Those 
of us who, as the author says, “bemoan the absence of the Mean Value Theorem 
or the $\varepsilon$, $\delta$ definition of limit” regret that “it is time...to rethink the theory
taught in standard calculus classes.” Some of us are disappointed that “there is 
nothing sacred about related rates;” we used to regard them highly, for related 
rates give us the heat equation, one of the classic models of mathematical physics, 
and the primary example for the study of elliptic and parabolic partial differential 
equations. 

Whether or not the militants’ ‘final product’ is ‘better,’ which is by no means 
established [3], one thing is clear: books such as the “Harvard Calculus” are
‘enablers;’ by legitimizing the abandonment of the concepts of mathematical
proof, related rates, convergence of series, and so forth from the calculus se- 
quence, other texts and teachers will feel free to follow. 

Mathematics is unique in its concern with rigorous foundations and proofs. 
Here its role as ‘Queen and servant of the Sciences’ is to offer the content of 
calculus as an anchor of certainty to aid the disciplines it serves. Should we not 
attempt to convey some sense of the remarkable way that the results of calculus 
can be proved to be true to those who will use it?


Fermat’s Last Theorem, the Four Color
Conjecture, and Bill Clinton 
for April Fools’ Day 


Edward B. Burger and Frank Morgan 


1. INTRODUCTION. We often fool our students. We fool them by stating and 
proving theorems that are in a polished and final form. As a result, students are 
often unaware of the evolution of ideas, strategies, statements, proofs and even 
important mistakes that inevitably lead to the beautiful theorems and proofs they 
encounter. There is much value in understanding early ideas and mistakes. In 
some cases, mistakes lead to the discovery of powerful and deep new mathematical 
truths. Thus we believe that it is occasionally appropriate to celebrate mathemati- 
cal mistakes and we cannot imagine a better time for such a celebration than on 
April Fools’ Day. Here we describe and outline three erroneous proofs from the 
19th century. We provide “proofs” of Fermat’s Last Theorem, the Four Color 
Conjecture, and the fact that one of us is Bill Clinton. This paper is based upon a 
special undergraduate mathematics colloquium the authors gave on April 1, 1996. 


2. FERMAT’S LAST THEOREM. The problem that was to become one of the 
most infamous open problems in mathematics had a most inauspicious beginning. 
Circa 1637, Pierre de Fermat (Figure la), while studying Bachet’s Latin translation 
of Arithmetica by Diophantus, came upon a discussion of the Pythagorean theo- 
rem. This inspired Fermat to write the following, now famous, lines in the margin: 


“It is impossible to separate a cube into two cubes, or a biquadrate into two
biquadrates, or generally any power except a square into two powers with the 
same exponent. I have discovered a truly wonderous demonstration of this, which 
this margin is too narrow to contain.” 


Fermat was notorious for making such statements with usually little or no 
justification or proof. By the 1800’s all of Fermat’s statements had been resolved; 
all but the above one—his “last” one. 


Fermat’s Last Theorem. For any integer N ≥ 3, there are no integer solutions to
$x^N + y^N = z^N$ with $xyz \ne 0$.


We do point out that Fermat himself did provide a complete proof of the above 
result in the case of N = 4. His proof involved a clever idea now known as 
Fermat’s method of descent. The theme of ‘method of descent’ is to hypothesize 
that there are positive integer solutions to the problem at hand and then to use 
those solutions to construct another set of positive integer solutions that are, in 
some sense, smaller than the hypothesized solutions. Iterating this procedure 
indefinitely leads to a contradiction since there are only finitely many positive 
integer solutions that are smaller than the hypothesized solutions.

On March 1, 1847, Gabriel Lamé [9] (Figure 1b), who had already proved the
result in the case N = 7, announced to the Paris Academy that he had a complete
proof of Fermat’s Last Theorem. However, Joseph Liouville [10] quickly pointed
out a serious error in the argument. The following outline of a “proof” of Fermat’s
Last Theorem contains some of the same mistakes that Lamé made. Can you find 
the errors? 

Before embarking upon our “proof” of Fermat’s Last Theorem, we define some 
notions that play a major role in the argument to follow. Let p be an odd prime 
and consider the polynomial $f_p(x) = x^p - 1$. If we write $\zeta_p = e^{2\pi i/p}$, then since
the powers of $\zeta_p$ are zeros of $f_p$ we easily conclude that

$$f_p(x) = (x - 1)(x - \zeta_p)(x - \zeta_p^2) \cdots (x - \zeta_p^{p-1}).$$
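
This factorization is easy to confirm numerically. The following is a minimal Python sketch; the function names are hypothetical and the sample values are chosen only for illustration. It also checks a consequence that holds for odd p: replacing x by −x/y and clearing denominators gives a product formula for x^p + y^p.

```python
import cmath

def zeta(p):
    """Primitive p-th root of unity e^(2*pi*i/p)."""
    return cmath.exp(2j * cmath.pi / p)

def product_form(x, p):
    """Evaluate (x - 1)(x - zeta)...(x - zeta^(p-1))."""
    z = zeta(p)
    result = 1 + 0j
    for n in range(p):
        result *= x - z ** n
    return result

def sum_of_powers_product(x, y, p):
    """Evaluate (x + y)(x + zeta*y)...(x + zeta^(p-1)*y); p must be odd."""
    z = zeta(p)
    result = 1 + 0j
    for n in range(p):
        result *= x + z ** n * y
    return result

if __name__ == "__main__":
    print(product_form(2.0, 5), 2.0 ** 5 - 1)
    print(sum_of_powers_product(2, 3, 5), 2 ** 5 + 3 ** 5)
```

Up to floating-point rounding, the two numbers on each printed line agree.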


The key to the following argument is the idea of extending the notion of integer 
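and employing the arithmetic of generalized integers. We define the ring of pth
cyclotomic integers $\mathbb{Z}[\zeta_p]$ by

$$\mathbb{Z}[\zeta_p] = \left\{ \sum_{n=0}^{p-1} a_n \zeta_p^n : a_n \in \mathbb{Z} \right\}.$$

For $\alpha, \beta \in \mathbb{Z}[\zeta_p]$, we say that $\alpha$ divides $\beta$, denoted as $\alpha \mid \beta$, if there exists an
element $\gamma \in \mathbb{Z}[\zeta_p]$ such that $\beta = \alpha\gamma$. In this case we say that $\alpha$ is a factor of $\beta$.
An element $\omega \in \mathbb{Z}[\zeta_p]$ is called a unit if $\omega \mid 1$. For example, $\zeta_p^t$ is a unit for any
$t \in \mathbb{Z}$. A nonzero element $\pi \in \mathbb{Z}[\zeta_p]$ is a prime if $\pi$ is not a unit and whenever
$\pi = \omega_1 \omega_2$, $\omega_1, \omega_2 \in \mathbb{Z}[\zeta_p]$, then either $\omega_1$ or $\omega_2$ is a unit.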


“Proof” of Fermat’s Last Theorem. By factoring the exponent N into primes, it is 
not hard to see that it suffices to prove Fermat’s Last Theorem for N = 4 and for 
N an arbitrary odd prime. As we noted before, Fermat himself proved the case 
N = 4, thus we need consider only the case N = p, where p is an odd prime. In 
1770, Euler proved the result for p = 3 and hence we may assume that p is a 
prime greater than 3. 

We prove the theorem by contradiction. That is, we assume that there is an 
integer solution to 


$$x^p + y^p = z^p, \quad \text{with } xyz \ne 0. \qquad (1)$$


By dividing out any common factors, we may assume that x, y, and z are pairwise 
relatively prime. We now consider the two possible cases. 


Case 1. The prime p does not divide xyz. 
Case 2. The prime p does divide xyz. 


In order to analyze these cases, we make the following fundamental observation:

$$x^p + y^p = (x + y)(x + \zeta_p y)(x + \zeta_p^2 y) \cdots (x + \zeta_p^{p-1} y).$$

Thus,

$$\prod_{n=0}^{p-1} (x + \zeta_p^n y) = z^p. \qquad (2)$$


Perhaps not surprisingly, the arithmetic of the cyclotomic integers $\mathbb{Z}[\zeta_p]$ is
analogous to the arithmetic of ordinary integers $\mathbb{Z}$. For example, two elements
$\alpha, \beta \in \mathbb{Z}[\zeta_p]$ are said to be relatively prime if they have no common factors other
than units. Given this, it turns out that in Case 1, the factors occurring in (2) are
pairwise relatively prime. In Case 2, however, it turns out that all the elements
occurring in the product (2) have as a common factor the prime $(\zeta_p - 1)$, and once
we divide out this factor from each element, the remaining cyclotomic integers are
pairwise relatively prime.

Suppose our solution to (1) lies in Case 1. In this case, because the factors in the
product of (2) are pairwise relatively prime, each element in the product must be a
perfect pth power of a cyclotomic integer multiplied by a unit. In particular, there
must exist a nonzero cyclotomic integer $\omega$ and a unit $\varepsilon$ so that

$$x + \zeta_p y = \varepsilon \omega^p.$$

After some work and computation, this reveals that $x \equiv y \bmod p$.

Thus, from our original equation $x^p + y^p + (-z)^p = 0$, we have discovered
that $x \equiv y \bmod p$. By symmetry, our argument may be reworked to deduce
$x \equiv -z \bmod p$ and $y \equiv -z \bmod p$. These congruences together with Fermat’s
Little Theorem, which states that $n^p \equiv n \bmod p$ for any integer n, reveal

$$0 = x^p + y^p + (-z)^p \equiv x + y + (-z) \equiv 3x \bmod p.$$


Therefore, p divides 3x. As p is a prime greater than 3, p must divide x, but this 
contradicts the assumptions of Case 1. Hence Case 1 is impossible. 

Thus our hypothesized solution to (1) must satisfy Case 2. In this case we know
that each of the elements in the product (2) has a common prime divisor of
$(\zeta_p - 1)$ and the cyclotomic integers $(x + y)(\zeta_p - 1)^{-1}$, $(x + \zeta_p y)(\zeta_p - 1)^{-1}$,
$(x + \zeta_p^2 y)(\zeta_p - 1)^{-1}, \ldots, (x + \zeta_p^{p-1} y)(\zeta_p - 1)^{-1}$ are pairwise relatively prime. By
an analysis similar to the one in Case 1, we must have


$$(x + \zeta_p^n y)(\zeta_p - 1)^{-1} = \varepsilon_n \omega_n^p, \qquad (3)$$

where $\varepsilon_n$ is a unit and $\omega_n$ is a cyclotomic integer for $n = 0, 1, 2, \ldots, p - 1$ such
that $\omega_0, \omega_1, \ldots, \omega_{p-1}$ are all pairwise relatively prime. Using basic algebra and
arithmetic in $\mathbb{Z}[\zeta_p]$ one may deduce the following identity free of the variables $x$
and $y$:


$$\omega_1^p + (\eta_1 \omega_{p-1})^p = \eta_2 (\zeta_p - 1)^t \gamma^p, \qquad (4)$$


where $\eta_1$ and $\eta_2$ are units in $\mathbb{Z}[\zeta_p]$, $\gamma \in \mathbb{Z}[\zeta_p]$ with $\omega_1$, $\omega_{p-1}$, $\gamma$, $(\zeta_p - 1)$ all
pairwise relatively prime, and $t \in \mathbb{Z}$ is greater than 1. Hence we have just found
pairwise relatively prime cyclotomic integers $X, Y, Z$ all relatively prime to $(\zeta_p - 1)$
such that


$$X^p + Y^p = v(\zeta_p - 1)^t Z^p, \qquad (5)$$


where $v$ is a unit and $t \in \mathbb{Z}$ satisfies $t > 1$.

We now employ Fermat’s method of descent. We begin by noting that we may
factor the left-hand side of (5) as before and deduce that

$$\prod_{n=0}^{p-1} (X + \zeta_p^n Y) = v(\zeta_p - 1)^t Z^p.$$

In this case again each factor $X + \zeta_p^n Y$ is divisible by $(\zeta_p - 1)$. We now argue as
before to deduce that there exist pairwise relatively prime cyclotomic integers
$X_1, Y_1, Z_1$, all relatively prime to $(\zeta_p - 1)$, and a unit $v_1$ so that


$$X_1^p + Y_1^p = v_1(\zeta_p - 1)^{t_1} Z_1^p,$$


where $t_1$ is an integer such that $1 < t_1 < t$. Hence we have a solution to an
equation that is very similar to (5). The key difference is that in the exponent of
$(\zeta_p - 1)$ we now have the integer $t_1$ rather than the integer $t$, where $1 < t_1 < t$.
We may repeat this procedure to produce pairwise relatively prime cyclotomic
integers $X_2, Y_2, Z_2$, all relatively prime to $(\zeta_p - 1)$, a unit $v_2$, and an integer $t_2$,
$1 < t_2 < t_1$, so that


$$X_2^p + Y_2^p = v_2(\zeta_p - 1)^{t_2} Z_2^p.$$


Repeating this process produces an infinite sequence of decreasing integers all
between 1 and $t$: $1 < \cdots < t_n < \cdots < t_2 < t_1 < t$. This absurdity implies that
Case 2 is impossible. Hence there must be no nonzero integer solutions to
$x^p + y^p = z^p$ and this completes the “proof” of Fermat’s Last Theorem. Where
did we go wrong?


The flaw. It turns out that our sentence, “Perhaps not surprisingly, the arithmetic of
the cyclotomic integers $\mathbb{Z}[\zeta_p]$ is analogous to the arithmetic of ordinary integers $\mathbb{Z}$,” is
inaccurate in one very important respect: in general, cyclotomic integers do not
have unique factorization into primes. If you think about it, that was the key step
that allowed us to conclude that each element in the product was a perfect $p$th
power; recall our statement, “because the factors in the product of (2) are pairwise
relatively prime, each element in the product must be a perfect $p$th power of a cyclotomic
integer multiplied by a unit.” In fact Lamé himself wrote [9, p. 314]:


Now, if one wants to make the product $k^p\, m\, m'\, m'' \cdots m^{(p-1)}$ equal to the $p$th
power of a complex number $C$, it is necessary that the numbers
$m, m', m'', \ldots, m^{(p-1)}$, that do not admit a common divisor, even taken two at
a time, be equal to $p$th powers, respectively.


Liouville then wrote [10, p. 319]: 


Nonetheless, some initial investigations lead me to believe that one should 
first try to establish, for the new complex numbers, a theorem similar to the 
elementary proposition for the ordinary integers, namely that there is only 
one way to decompose a product into prime factors. 


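The smallest prime for which $\mathbb{Z}[\zeta_p]$ does not satisfy the unique factorization
property is $p = 23$. In particular, it is a straightforward calculation to verify that


$$(1 + \zeta_{23}^{2} + \zeta_{23}^{4} + \zeta_{23}^{5} + \zeta_{23}^{6} + \zeta_{23}^{10} + \zeta_{23}^{11})(1 + \zeta_{23} + \zeta_{23}^{5} + \zeta_{23}^{6} + \zeta_{23}^{7} + \zeta_{23}^{9} + \zeta_{23}^{11})$$
$$= 2\zeta_{23}^{5}(1 + \zeta_{23} + \zeta_{23}^{2} + \zeta_{23}^{4} + \zeta_{23}^{5} + 3\zeta_{23}^{6} + \zeta_{23}^{7} + \zeta_{23}^{8} + \zeta_{23}^{10} + \zeta_{23}^{11} + \zeta_{23}^{12}).$$

It turns out that 2 is irreducible in $\mathbb{Z}[\zeta_{23}]$ and does not divide either of the two factors
on the left-hand side of the above identity.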
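
The identity above can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes an element of $\mathbb{Z}[\zeta_{23}]$ is stored as a list of 23 integer coefficients of $\zeta_{23}^0, \ldots, \zeta_{23}^{22}$, and it uses the relation $1 + \zeta_{23} + \cdots + \zeta_{23}^{22} = 0$, so two lists represent the same element exactly when they differ by a constant list. The helper names (`from_terms`, `multiply`, `same_element`, `divisible_by_2`) are hypothetical and chosen only for this illustration.

```python
P = 23  # we work in Z[zeta_23]

def from_terms(terms):
    """Build a coefficient list from (coefficient, exponent) pairs."""
    v = [0] * P
    for coeff, exp in terms:
        v[exp % P] += coeff
    return v

def multiply(u, v):
    """Multiply two elements, reducing exponents with zeta^23 = 1."""
    w = [0] * P
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            w[(i + j) % P] += a * b
    return w

def same_element(u, v):
    """Since 1 + zeta + ... + zeta^22 = 0, two coefficient lists give the
    same element exactly when their difference is a constant list."""
    return len({a - b for a, b in zip(u, v)}) == 1

def divisible_by_2(u):
    """u is in 2*Z[zeta_23] exactly when some constant shift makes every
    coefficient even, i.e. when all coefficients share one parity."""
    return len({a % 2 for a in u}) == 1

factor1 = from_terms([(1, e) for e in (0, 2, 4, 5, 6, 10, 11)])
factor2 = from_terms([(1, e) for e in (0, 1, 5, 6, 7, 9, 11)])
left = multiply(factor1, factor2)

inner = from_terms([(1, 0), (1, 1), (1, 2), (1, 4), (1, 5), (3, 6),
                    (1, 7), (1, 8), (1, 10), (1, 11), (1, 12)])
right = multiply(from_terms([(2, 5)]), inner)

print(same_element(left, right))                           # the identity holds
print(divisible_by_2(factor1), divisible_by_2(factor2))    # neither factor is divisible by 2
print(divisible_by_2(left))                                # but the product is
```

The check confirms only the arithmetic of the identity; that 2 cannot be factored further in $\mathbb{Z}[\zeta_{23}]$ is a separate fact that this sketch does not establish.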

There is another, somewhat more technical, problem in deriving identity (4): we
need to know that given any unit $\eta'$, there exists a unit $\eta_1$ such that $\eta_1^p = \eta'$. This
is not only not obvious, but sometimes impossible.

Ernst Kummer discovered that the preceding problems do not arise in certain
cases. In particular, he called a prime $p$ a regular prime if $p$ does not divide the
class number of $\mathbb{Z}[\zeta_p]$ (denoted by $h$). The class number $h$ is a positive integer
that, in a delicate sense, measures how far the ring is from having the unique
factorization property ($h = 1$ if and only if we have unique factorization into
primes). Thus the “proof” we have outlined is actually a correct proof of Fermat’s
Last Theorem for regular primes; this result is due to Kummer [8] (see [3], [11] for
further details and a complete proof). Notice that if $\mathbb{Z}[\zeta_p]$ satisfies the unique
factorization property, then $h = 1$ and plainly $p$ does not divide $h$. Thus, if $\mathbb{Z}[\zeta_p]$
has the unique factorization property, then Fermat’s Last Theorem is true for the
prime exponent $p$.

Kummer’s contributions in the direction of Fermat’s Last Theorem had tremen- 
dous ramifications. In May of 1847, Kummer [7] wrote a letter to Liouville and 
stated: 


Concerning the elementary proposition for these complex numbers, that a 
composite complex number may be decomposed into prime factors in only 
one way, which you so correctly cite as lacking in this proof—a proof 
defective in other ways as well—I can assure you that it does not hold in 
general for complex numbers of the form 


2 —1 


but it is possible to rescue it, by introducing a new kind of complex number, 
which I have called an ideal complex number. 


Kummer’s observation led to the birth of what we now call ideals. Finally in 
1994, Andrew Wiles (Figure 1c) [12], using powerful machinery from abstract 
algebra and the theory of elliptic curves, accomplished the momentous feat of 
producing a complete and correct proof for all primes. 


Figure 1. (a) P. de Fermat; (b) G. Lamé; (c) A. Wiles. 


3. THE FOUR COLOR CONJECTURE. In 1852, Francis Guthrie asked whether
every planar map (as in Figure 2) with connected countries can be colored with at
most four colors, so that adjacent countries have different colors. The conjecture
was finally proved by Kenneth Appel and Wolfgang Haken [1] of the University of
Illinois in 1976 with over a thousand hours of computer time, checking over ten
thousand cases (Figure 3a).


Figure 2. The Four Color Conjecture says that a planar map with connected countries can be colored 
with four colors such that bordering countries have different colors. 


But the first published “proof” appeared in 1879, by Alfred Kempe [6], a 
London barrister and amateur mathematician (Figures 3b, 3c). The result stood for 
eleven years, until Percy Heawood [5] caught the error in 1890. Heawood con- 
fessed his paper’s aim was “rather destructive than constructive, for it will be shown 
that there is a defect in the now apparently recognized proof.” 


Figure 3. (a) W. Haken (and K. Appel) finally proved the Four Color Conjecture in 1976 with over a 
thousand hours of computer time (photo courtesy of the American Mathematical Society); (b) Alfred 
Kempe gave the first published “proof” of the Four Color Conjecture in 1879; (c) Plate from Kempe’s 
1879 proof of the Four Color Conjecture [6]. 


Kempe’s Proof of the Four Color Conjecture. We proceed by induction on the
number of countries. We remark that if the result holds for countries that meet 
only in threes at a point, then it holds in general, since for example if countries 
meet in fours at a point, they can be perturbed as in Figure 4 to meet only in 
threes, then colored, then restored. (Countries diagonally across from each other 
are allowed to be the same color.) So we may assume if we like that countries meet 
only in threes. 


Figure 4. Countries meeting in fours can be perturbed to meet in threes. 


The conjecture obviously holds for one country. We will suppose the conjecture 
holds for m — 1 countries and prove it for m countries. It is not hard to show 
(using the concept of Euler characteristic, for example), that some country, say 
“Massachusetts,” has at most five neighbors. Shrink away Massachusetts as in 
Figure 5, and color the rest by induction. Now restore Massachusetts and find a 
way to color it by the following cases. 


Figure 5. Shrink away the mth country and color the rest by induction.


Case 1. Massachusetts has fewer than four neighbors. Simply color Massachusetts 
with an unused color. 


Case 2. Massachusetts has exactly four neighbors. We may assume they are four 
different colors, say green, red, blue, and yellow as in Figure 6, or we could color 
Massachusetts with an unused color. If there is no red-yellow chain from the top to
the bottom, starting at the top, switch the colors of all connected reds and yellows.
Then color Massachusetts red. If there is a red-yellow chain from top to bottom,
then there cannot be a blue-green chain from left to right. Starting at the left,
switch all connected blues and greens. Then color Massachusetts blue.


Figure 6. If there is a red-yellow chain, starting from the left switch all connected blues and greens and
then color Massachusetts blue.
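

The color-switching step used in Case 2 can be sketched in code. This is a minimal illustration under stated assumptions, not Kempe's own formulation: the map is assumed to be given as an adjacency dictionary of countries, the function name `kempe_swap` and the four-country example (labels `top`, `right`, `bottom`, `left`) are hypothetical, and Massachusetts itself is left out of the graph while its neighbors are recolored.

```python
from collections import deque

def kempe_swap(adjacency, coloring, start, color_a, color_b):
    """Return a new coloring in which color_a and color_b are exchanged on the
    connected two-colored chain that contains the country `start`."""
    if coloring[start] not in (color_a, color_b):
        return dict(coloring)
    chain = {start}
    queue = deque([start])
    while queue:  # breadth-first search restricted to the two colors
        country = queue.popleft()
        for neighbor in adjacency[country]:
            if neighbor not in chain and coloring[neighbor] in (color_a, color_b):
                chain.add(neighbor)
                queue.append(neighbor)
    swapped = dict(coloring)
    for country in chain:
        swapped[country] = color_b if coloring[country] == color_a else color_a
    return swapped

def is_proper(adjacency, coloring):
    """Check that no two adjacent countries share a color."""
    return all(coloring[u] != coloring[w] for u in adjacency for w in adjacency[u])

# Hypothetical example: the four neighbors of Massachusetts form a ring.
adjacency = {
    "top": ["right", "left"],
    "right": ["top", "bottom"],
    "bottom": ["right", "left"],
    "left": ["bottom", "top"],
}
coloring = {"top": "red", "right": "green", "bottom": "yellow", "left": "blue"}

# No red-yellow chain joins top to bottom here, so switch starting at the top.
recolored = kempe_swap(adjacency, coloring, "top", "red", "yellow")
free_colors = {"red", "green", "blue", "yellow"} - set(recolored.values())
print(recolored, free_colors)
```

In this small example the switch frees red for Massachusetts while the rest of the map stays properly colored; the search is restricted to the two colors being exchanged, which is what keeps the recoloring local to one chain.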


Case 3. Massachusetts has exactly five neighbors. Again we may assume that the 
neighbors use all four colors, as in Figure 7. If there is no green-yellow chain from 
left to right, starting from the right, switch all connected yellows and greens and 
then color Massachusetts yellow. If there is no green-blue chain from left to right, 
starting from the right, switch all connected blues and greens and then color 
Massachusetts blue. If there are both green-yellow and green-blue chains, starting 
at the top switch all connected reds and blues and starting at the bottom switch all 
connected reds and yellows. Then color Massachusetts red. 


Figure 7. If there are both green-yellow and green-blue chains, starting at the top switch all connected
reds and blues and starting at the bottom switch all connected reds and yellows. Then color
Massachusetts red.

That completes Kempe’s “proof.” Before reading on, can you find the flaw?


The flaw. The problem is that the two chains of Case 3 might cross as in Figure 8, 
so that the regions they enclose can overlap. Then switching reds and blues from 
the top may interfere with switching reds and yellows from the bottom. You might 
think you could still salvage the proof, but it is just not that easy. For a nice 
account of the Four Color Conjecture, see [13]. 


Figure 8. The unjustified assumption in Figure 7 is that the chains do not cross, but they might as 
illustrated here. 


Incidentally, the Four Color Conjecture technically does not apply to the 
continental United States, even if you include water to connect a state such as 
Michigan, because Kentucky has a little disconnected piece surrounded by 
Tennessee and Missouri. If you do not believe us, just try to drive from Louisville 
all the way southwest to the tiny New Madrid Bend region without leaving 
Kentucky. 


4. I AM BILL CLINTON. We now illustrate how one can “prove” anything by
proving that “I am Bill Clinton.” Consider the statement: 


This statement is false or I am Bill Clinton. 


If the statement is false, then the first clause is true, so the statement is true, 
which is a contradiction. Therefore the statement must be true. Since the first 
clause is false, the second clause must be true. Hence I am Bill Clinton. (See 
Figure 9.) 

Actually 19th-century mathematics permitted one to make such arguments.
Mathematics was restructured so as to forbid such self-referential statements,
founded on a much more technical Zermelo-Fraenkel set theory, which does not
allow sets to contain themselves; see [4, Section 3.4] or [2].


Figure 9. Authors Edward Burger and Frank Morgan and an unidentified third claimant to “I am
Bill Clinton.”


NOTES


Edited by Jimmie D. Lawson 


Euler’s $\phi$ Function
on Arithmetic Progressions


D. J. Newman


We will study the $\phi$ function’s behavior on arithmetic progressions. The numerical
evidence is extremely misleading in many cases. Thus it was pointed out by Dov
Jarden in his book Recurring Sequences, Riveon Lematematika, Jerusalem, 1973,
page 65, that $\phi(30n + 1) > \phi(30n)$ for all $n < 10000$. (Indeed, further computations
have shown that this persists up to 20,000,000.) Another case that resists is
that of $6n + 1$ vs. $6n + 2$, where again, the inequality $\phi(6n + 1) > \phi(6n + 2)$
holds well into the millions.

At least, in this case, we were able to explicitly produce the smallest n predicted 
by our theorem: it is 6,197,024. In Jarden’s case and in many others, the n is not 
explicitly available and it may be beyond the reach of any possible computers! At 
any rate we can prove the following. 
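
The value 6,197,024 can be examined directly. The sketch below is an illustration only, assuming a simple trial-division totient (the helper name `phi` is ours, not the author's): it checks that $6n + 1$ is then the product of the seven primes from 5 to 23, and compares the two totients at that $n$ and over a short initial range.

```python
def phi(n):
    """Euler's totient via trial division (adequate for numbers of this size)."""
    result = n
    remaining = n
    p = 2
    while p * p <= remaining:
        if remaining % p == 0:
            while remaining % p == 0:
                remaining //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if remaining > 1:              # one prime factor larger than sqrt is left
        result -= result // remaining
    return result

n = 6_197_024
print(6 * n + 1 == 5 * 7 * 11 * 13 * 17 * 19 * 23)
print(phi(6 * n + 1) < phi(6 * n + 2))

# For small n the inequality points the other way.
print(all(phi(6 * k + 1) > phi(6 * k + 2) for k in range(1, 2000)))
```

The sketch verifies the reversal at this one value of $n$; confirming that no smaller $n$ works would require scanning the whole range below it, which is not attempted here.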


Theorem. If $a, b, c, d$ are nonnegative integers with $a, c > 0$ and $ad - bc \neq 0$ then
there exists a positive integer $n$ for which $\phi(an + b) < \phi(cn + d)$.


Remarks: By replacing $b$ by $b + aN$ and $d$ by $d + cN$, $N$ large, we see in fact
that our theorem gives infinitely many such $n$. The condition $ad - bc \neq 0$ is
certainly a necessary one since otherwise we would have the case of $a = c$, $b = d$
which gives equality of $\phi(an + b)$ and $\phi(cn + d)$, but even worse we would have
the case of $\phi(4n)$ which is always strictly bigger than $\phi(n)$.


Proof of the theorem: Begin with the linear Diophantine equation $ax + b = Py$,
where $P$ is chosen prime to $a$ and with $(\phi(P)/P) < \varepsilon$, $\varepsilon$ later to be specified. For
example, $P$ may be chosen as the product of many consecutive primes. Notice that
then


$$\phi(ax + b) = \phi(Py) \le \phi(P)\,y = \frac{\phi(P)}{P}(ax + b) < \varepsilon(ax + b).$$


Next observe that a general solution to our equation is given by $x = x_0 + kP$,
$y = y_0 + ka$, $k$ an arbitrary integer. This gives $cx + d = cx_0 + d + kcP$, and if we
denote $\delta = \gcd(cx_0 + d, cP)$ and note that it divides $a(cx_0 + d) - y_0 cP = a(cx_0 +
d) - c(ax_0 + b) = ad - bc$, we obtain $\delta \le |ad - bc|$. Factoring out $\delta$ gives $cx + d
= \delta(A + kB)$ where $A$ and $B$ are relatively prime.

We now recall Dirichlet’s great theorem on primes in arithmetic progressions. 
This states that if A and B are relatively prime integers with B > 0, then there 
are infinitely many integers, k, for which A + Bk is a prime. We may apply 
Dirichlet’s theorem to our case and thereby choose k so as to make A + Bk a 
prime, which we shall call p. 

Now we have


$$\phi(cx + d) = \phi(\delta p) \ge \phi(p) = p - 1 \ge p/2 = \frac{cx + d}{2\delta}.$$

Since $(cx + d)/(ax + b)$ is monotonic and positive, it has a positive minimum, $m$,
so

$$\frac{cx + d}{2\delta} \ge \frac{m(ax + b)}{2\delta} \ge \frac{m(ax + b)}{2|ad - bc|} = \varepsilon(ax + b),$$


where $\varepsilon$ is defined by this last equation. ∎

1830 Rittenhouse Square, Philadelphia, PA 19103 


A Note on Weyl’s Inequality 


Steve Fisk 


We present a simple inequality for the eigenvalues of Hermitian matrices that 
implies Weyl’s inequality and the monotonicity theorem. The underlying idea, to 
intersect suitably chosen subspaces to obtain eigenvalue inequalities, is not new. 
See [1] and [2]. 


Lemma 1. If $S_1, \ldots, S_k$ are subspaces of an $n$-dimensional vector space $V$, and if
$\dim(S_1) + \cdots + \dim(S_k) > n(k - 1)$, then the intersection of all the $S_i$’s is non-zero.


Proof: Consider the map from the direct sum of the subspaces $S_i$ to the sum of
$k - 1$ copies of $V$ that sends $(v_1, \ldots, v_k)$ to $(v_1 - v_2, v_2 - v_3, \ldots, v_{k-1} - v_k)$. The
intersection of all the $S_i$’s is the kernel of this map, and has dimension at least
$\dim(S_1) + \cdots + \dim(S_k) - (k - 1)n$. ∎


If $H$ is an $n$-by-$n$ Hermitian matrix, we denote its ordered eigenvalues by
$\lambda_1(H) \le \cdots \le \lambda_n(H)$. An $n$-by-$n$ matrix $X$ is negative semidefinite if $v^* X v \le 0$ for
every vector $v$. For instance, the zero matrix is negative semidefinite.


Theorem 1. Suppose $H_1, \ldots, H_k$ are $n$-by-$n$ Hermitian matrices such that $H_1 + H_2
+ \cdots + H_k$ is negative semidefinite. Then


$$\lambda_{i_1}(H_1) + \lambda_{i_2}(H_2) + \cdots + \lambda_{i_k}(H_k) \le 0$$

for all $i_1, \ldots, i_k \in \{1, \ldots, n\}$ such that $i_1 + \cdots + i_k < n + k$.


Proof: Let $S_j$ be the subspace spanned by eigenvectors of $H_j$ corresponding to the
eigenvalues $\lambda_{i_j}(H_j), \lambda_{i_j + 1}(H_j), \ldots, \lambda_n(H_j)$. Since


$$\sum_{j=1}^{k} \dim(S_j) = \sum_{j=1}^{k} (n - i_j + 1) = nk + k - (i_1 + \cdots + i_k) > n(k - 1),$$


Lemma 1 ensures that there is a unit vector $x$ in the intersection of all the $S_j$’s.
Now $\lambda_{i_j}(H_j)$ is the smallest eigenvalue of $H_j$ restricted to $S_j$, and therefore

$$\lambda_{i_j}(H_j) \le x^* H_j x, \quad \text{for } j = 1, \ldots, k,$$

since each $S_j$ is invariant under $H_j$. Adding these inequalities gives

$$\sum_{j=1}^{k} \lambda_{i_j}(H_j) \le x^* \Bigl[\sum_{j=1}^{k} H_j\Bigr] x \le 0.$$ ∎


Corollary 1 (Weyl’s Inequality). If $A$, $B$ are $n$-by-$n$ Hermitian, then

$$\lambda_j(A) + \lambda_k(B) \le \lambda_{j+k-1}(A + B).$$

Proof: Take $H_1 = A$, $H_2 = B$, $H_3 = -(A + B)$, $i_1 = j$, $i_2 = k$, $i_3 = n - j - k + 2$.
The result follows from Theorem 1 and the fact that $\lambda_i(-C) = -\lambda_{n+1-i}(C)$ for
any Hermitian matrix $C$. ∎


Corollary 2 (Monotonicity Theorem). If $A$, $B$ are $n$-by-$n$ Hermitian and $B$ is positive
semidefinite, then $\lambda_i(A) \le \lambda_i(A + B)$ for all $i = 1, \ldots, n$.


Proof: Take $H_1 = A$, $H_2 = -A - B$, $i_1 = i$, $i_2 = n - i + 1$. ∎

The choice of eigenvalues in Theorem 1 is the best possible.


Theorem 2. If $i_1 + \cdots + i_k \ge n + k$, then there are $n$-by-$n$ Hermitian matrices
$H_1, \ldots, H_k$ such that $H_1 + \cdots + H_k$ is negative semidefinite and


$$\lambda_{i_1}(H_1) + \cdots + \lambda_{i_k}(H_k) > 0.$$


Proof: We may assume that $i_1 + \cdots + i_k = n + k$. Let $H_s$ be the diagonal matrix
whose diagonal is all 1’s, except for the entries in rows $(i_1 + \cdots + i_{s-1}) + 2 -
s, \ldots, (i_1 + \cdots + i_s) - s$, where the entries are $1 - k$. $H_1 + \cdots + H_k = 0$ is negative
semidefinite, and $\lambda_{i_s}(H_s) = 1$, so $\lambda_{i_1}(H_1) + \cdots + \lambda_{i_k}(H_k) = k$. ∎
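

The construction in the proof of Theorem 2 is concrete enough to try on a small case. The sketch below is only an illustration: because every $H_s$ is diagonal, each matrix is stored as the list of its diagonal entries, whose sorted values are its eigenvalues. The function names and the sample choice $n = 4$, $(i_1, i_2, i_3) = (2, 3, 2)$ are assumptions made for the example.

```python
def extremal_diagonals(indices, n):
    """Diagonals of H_1, ..., H_k from the proof of Theorem 2,
    assuming i_1 + ... + i_k = n + k."""
    k = len(indices)
    assert sum(indices) == n + k and all(1 <= i <= n for i in indices)
    diagonals = []
    partial = 0  # i_1 + ... + i_{s-1}
    for s, i_s in enumerate(indices, start=1):
        first = partial + 2 - s      # first row (1-based) holding 1 - k
        last = partial + i_s - s     # last row (1-based) holding 1 - k
        diagonals.append([1 - k if first <= row <= last else 1
                          for row in range(1, n + 1)])
        partial += i_s
    return diagonals

def eigenvalue(diagonal, i):
    """i-th smallest eigenvalue (1-based) of a diagonal matrix."""
    return sorted(diagonal)[i - 1]

n, indices = 4, (2, 3, 2)
H = extremal_diagonals(indices, n)
total = [sum(column) for column in zip(*H)]         # diagonal of H_1 + ... + H_k
chosen = [eigenvalue(h, i) for h, i in zip(H, indices)]
print(H)
print(total, chosen, sum(chosen))
```

Here the three matrices sum to zero while each selected eigenvalue equals 1, so the selected eigenvalues add up to $k = 3 > 0$, as the proof asserts.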


Minimal Polynomials Over
Cyclotomic Fields 


Ming-chang Kang 


1. Introduction. Throughout this note, $n$ and $m$ denote two positive integers
with $d$ the greatest common divisor of $n$ and $m$. We define $e$ by $n = de$, and let


$\zeta_n = e^{2\pi\sqrt{-1}/n}$, which is a primitive $n$th root of 1.

The minimal polynomial of $\zeta_n$ over $\mathbb{Q}$ is $\Phi_n(X)$, the $n$th cyclotomic polynomial,
defined by $\Phi_n(X) = \prod_{1 \le i \le n,\ \text{g.c.d.}\{i, n\} = 1} (X - \zeta_n^i)$, by [J, p. 264]. However, if $m \ge 3$,
$\Phi_n(X)$ might decompose over $\mathbb{Q}(\zeta_m)$, the field generated by $\zeta_m$ over the rational
numbers. In [A, p. 531], one finds the exercise: Find the minimal polynomial
over $\mathbb{Q}(\zeta_3)$ of $\zeta_6, \zeta_9, \zeta_{12}$ respectively. The answer is: g.c.d. $\{\Phi_6(X), X^2 - \zeta_3\}$,
g.c.d. $\{\Phi_9(X), X^3 - \zeta_3\}$, g.c.d. $\{\Phi_{12}(X), X^4 - \zeta_3\}$ for $\zeta_6, \zeta_9, \zeta_{12}$ respectively.
(Hint. $\zeta_6$ is a zero of both $\Phi_6(X) = X^2 - X + 1$ and $X^2 - \zeta_3$. Hence it is a zero
of their difference, which is $X - 1 - \zeta_3$.) One might wonder whether the analogous
result is true in general. We show that this is so.


Theorem 1. The minimal polynomial of $\zeta_n$ over $\mathbb{Q}(\zeta_m)$ is the greatest common divisor
of $\Phi_n(X)$ and $X^e - \zeta_d$.


2. Preliminaries. The Euler $\varphi$-function $\varphi(n)$ is the number of positive integers
$\le n$ that are relatively prime to $n$. If we denote by $(\mathbb{Z}/n\mathbb{Z})^\times$ the group of units in
$\mathbb{Z}/n\mathbb{Z}$, the ring of integers modulo $n$, it is not difficult to see that an integer $k$ is
relatively prime to $n$ if and only if $k$ is invertible in $\mathbb{Z}/n\mathbb{Z}$. In particular, $\varphi(n)$ is
the cardinal number of $(\mathbb{Z}/n\mathbb{Z})^\times$.


Lemma 2 [J, p. 107]. If $n_1$ and $n_2$ are relatively prime, then $\varphi(n_1 n_2) = \varphi(n_1)\varphi(n_2)$.
Moreover, if $n = p_1^{\lambda_1} \cdots p_r^{\lambda_r}$ where $\lambda_i \ge 1$ and $p_1, \ldots, p_r$ are distinct prime numbers,
then $\varphi(n) = n \prod_{i=1}^{r} (1 - p_i^{-1})$.


Lemma 3 (Chinese Remainder Theorem) [J, p. 107]. If $n_1$ and $n_2$ are relatively
prime, then $\mathbb{Z}/n_1 n_2 \mathbb{Z} \cong \mathbb{Z}/n_1\mathbb{Z} \times \mathbb{Z}/n_2\mathbb{Z}$. It follows that $(\mathbb{Z}/n_1 n_2 \mathbb{Z})^\times \cong (\mathbb{Z}/n_1\mathbb{Z})^\times
\times (\mathbb{Z}/n_2\mathbb{Z})^\times$.


Lemma 4. Let $d$ and $d'$ be positive integers such that $d$ is divisible by every prime factor
of $d'$. Then (i) $\varphi(d'd) = d'\varphi(d)$; and (ii) for any non-zero integer $i$, g.c.d. $\{i, d\} = 1$ if
and only if g.c.d. $\{i, d'd\} = 1$.


Proof: (i) Let $d = p_1^{\lambda_1} \cdots p_r^{\lambda_r}$ be the decomposition of $d$ into the product of prime
numbers with $\lambda_i \ge 1$, $p_i \neq p_j$ if $i \neq j$. Then $d'd = p_1^{\varepsilon_1} \cdots p_r^{\varepsilon_r}$ for some $\varepsilon_i$, where
$\varepsilon_i \ge \lambda_i$ for $1 \le i \le r$. By Lemma 2, we find $\varphi(d'd) = d'd \prod_{i=1}^{r} (1 - 1/p_i) = d'\varphi(d)$.

(ii) This is easy since a prime number $p$ divides $d$ if and only if it divides $d'd$. ∎


3. A proof of Theorem 1. Let $l$ be the least common multiple of $n$ and $m$, $n = de$,
and $d =$ g.c.d. $\{n, m\}$. Write $e = d'e'$ and $m = df$, where g.c.d. $\{d, e'\} = 1$ and $d$ is
divisible by every prime factor of $d'$. Note that g.c.d. $\{e, f\} = 1$. Hence g.c.d.
$\{e', m\} = 1$.


Lemma 5. $\mathbb{Q}(\zeta_n, \zeta_m) = \mathbb{Q}(\zeta_l)$.


Proof: The multiplicative subgroup generated by $\zeta_n$ and $\zeta_m$ is contained in the
cyclic group generated by $\zeta_l$. On the other hand, since $d =$ g.c.d. $\{n, m\}$, we
can find integers $a$ and $b$ such that $an + bm = d$. Dividing by $nm$, we get $a/m +
b/n = 1/l$. Hence $\zeta_l$ belongs to the subgroup generated by $\zeta_n$ and $\zeta_m$. ∎


Lemma 6. $[\mathbb{Q}(\zeta_n, \zeta_m) : \mathbb{Q}(\zeta_m)] = [\mathbb{Q}(\zeta_n) : \mathbb{Q}(\zeta_d)] = d'\varphi(e')$.

Proof: $[\mathbb{Q}(\zeta_n) : \mathbb{Q}(\zeta_d)] = [\mathbb{Q}(\zeta_n) : \mathbb{Q}]/[\mathbb{Q}(\zeta_d) : \mathbb{Q}] = \varphi(dd'e')/\varphi(d) =
\varphi(e')\varphi(d'd)/\varphi(d) = d'\varphi(e')$ by Lemma 4 and noting that $e'$ and $d'd$ are relatively
prime.

On the other hand, Lemma 5 ensures that it is sufficient to show that $[\mathbb{Q}(\zeta_l) :
\mathbb{Q}(\zeta_m)] = d'\varphi(e')$. Now, since $l = me$, we have $\varphi(l)/\varphi(m) = \varphi(me)/\varphi(m) =
\varphi(md'e')/\varphi(m) = \varphi(md')\varphi(e')/\varphi(m) = d'\varphi(e')$ again by Lemma 4 and noting that
$md'$ and $e'$ are relatively prime. ∎

Now to the proof of Theorem 1. 

In general, the minimal polynomial of $\zeta_n$ over $\mathbb{Q}(\zeta_m)$ is a factor of the minimal
polynomial of $\zeta_n$ over $\mathbb{Q}(\zeta_d)$. However, we know that the degrees of these two
polynomials are the same because of Lemma 6. Thus, we can consider only the
minimal polynomial of $\zeta_n$ over $\mathbb{Q}(\zeta_d)$ in the sequel. Since this polynomial divides
the g.c.d. of both

$$\Phi_n(X) = \prod_{\text{g.c.d.}\{i, n\} = 1} (X - \zeta_n^i) \quad \text{and} \quad X^e - \zeta_d = \prod_{0 \le j \le e-1} (X - \zeta_n^{1 + jd}),$$

and the degree of this minimal polynomial is $d'\varphi(e')$, proving Theorem 1 is
equivalent to showing that the set $S := \{1 + jd : 0 \le j \le e - 1,\ \text{g.c.d.}\{n, 1 + jd\}
= 1\}$ consists of $d'\varphi(e')$ elements.
Write $S$ as


$$S = \bigcup_{0 \le t \le d'-1} \{1 + (te' + k)d : 0 \le k \le e' - 1,\ \text{g.c.d.}\{1 + (te' + k)d, n\} = 1\}.$$

By Lemma 3, we have $(\mathbb{Z}/n\mathbb{Z})^\times \cong (\mathbb{Z}/d'd\mathbb{Z})^\times \times (\mathbb{Z}/e'\mathbb{Z})^\times$. Thus $1 + (te' + k)d$ is
invertible in $\mathbb{Z}/n\mathbb{Z}$ if and only if it is invertible in both $\mathbb{Z}/d'd\mathbb{Z}$ and $\mathbb{Z}/e'\mathbb{Z}$. Since
$1 + (te' + k)d$ is invertible in $\mathbb{Z}/d\mathbb{Z}$, it is automatically invertible in $\mathbb{Z}/d'd\mathbb{Z}$ by
part (ii) of Lemma 4. Thus g.c.d. $\{1 + (te' + k)d, n\} = 1$ if and only if g.c.d.
$\{1 + kd, e'\} = 1$, and therefore $S$ can be rewritten as $S = \bigcup_{0 \le t \le d'-1} (te'd + T)$,
where $T := \{1 + kd : 0 \le k \le e' - 1,\ \text{g.c.d.}\{1 + kd, e'\} = 1\}$.

From the definition of $T$, we may interpret the elements in $T$ as those invertible
elements in $\mathbb{Z}/e'\mathbb{Z}$ that are of the form $1 + kd$ for some $k$ because g.c.d.
$\{d, e'\} = 1$ and we find that $\mathbb{Z}/e'\mathbb{Z} = \{1 + kd : 0 \le k \le e' - 1\} \pmod{e'}$. Note that
for any invertible element $s$ in $\mathbb{Z}/e'\mathbb{Z}$, there is a unique integer $k \pmod{e'}$ such
that $s \equiv 1 + kd \pmod{e'}$ because we may solve the equation $dX \equiv s - 1 \pmod{e'}$
uniquely. It follows that $T = (\mathbb{Z}/e'\mathbb{Z})^\times$ under such an interpretation. Hence $T$ has
precisely $\varphi(e')$ elements, and therefore $S$ consists of $d'\varphi(e')$ elements. ∎
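
The counting statement at the heart of the proof, that $S$ has exactly $d'\varphi(e')$ elements, is easy to test numerically. The following sketch assumes nothing beyond the definitions in the text; the helper names (`phi`, `split_e`, `exponent_set`) are ours. It builds $d$, $e$, $d'$, $e'$ from $n$ and $m$, lists the exponents in $S$, and compares sizes over a range of small $n$ and $m$.

```python
from math import gcd

def phi(n):
    """Euler's function: count of 1 <= k <= n with gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def split_e(n, m):
    """Return (d, e, d_prime, e_prime) with n = d*e, e = d_prime*e_prime,
    gcd(d, e_prime) = 1 and every prime factor of d_prime dividing d."""
    d = gcd(n, m)
    e = n // d
    e_prime = e
    g = gcd(e_prime, d)
    while g > 1:                 # strip from e every prime shared with d
        e_prime //= g
        g = gcd(e_prime, d)
    return d, e, e // e_prime, e_prime

def exponent_set(n, m):
    """The set S = {1 + j*d : 0 <= j <= e-1, gcd(n, 1 + j*d) = 1}."""
    d, e, _, _ = split_e(n, m)
    return [1 + j * d for j in range(e) if gcd(n, 1 + j * d) == 1]

# The three cases of the exercise over Q(zeta_3).
for n in (6, 9, 12):
    print(n, exponent_set(n, 3))

# Compare |S| with d' * phi(e') for small n and m.
ok = all(
    len(exponent_set(n, m)) == split_e(n, m)[2] * phi(split_e(n, m)[3])
    for n in range(1, 61) for m in range(1, 61)
)
print(ok)
```

The exponents listed for a given $n$ and $m$ are those $i$ for which $\zeta_n^i$ is a common root of $\Phi_n(X)$ and $X^e - \zeta_d$, so the length of the list is the degree of the g.c.d. in Theorem 1.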

Finally, an exercise for the reader: Use Theorem 1 to find the minimal
polynomial of $\zeta_n$ over the maximal real subfield of $\mathbb{Q}(\zeta_m)$, i.e., the field


$\mathbb{Q}(\cos(2\pi/m))$.


1. INTRODUCTION AND HISTORICAL REMARKS. The topic I propose to
discuss is a concern of the mathematician. However, it is not a topic of mathemat- 
ics but of philosophy, at least if one agrees that philosophy is not itself a 
specialized science but a discipline that deals with the interaction of all human 
endeavors. Although I am a working mathematician with not more than an 
amateur’s knowledge of philosophy, I nevertheless hope to be able to make at least 
some valid observations that will contribute to a better understanding of a rather 
complex situation. 

Mathematics begins with an understanding of the abstract concept of a natural 
number (i.e., of the numbers 1, 2, 3,—ad inf.) and the ability to count indefinitely. 
Today, this understanding is practically universal and, in this sense, we may say 
that every human being is a mathematician. It is a curious fact that the mathe- 
matical component in the emergence of civilization is hardly ever mentioned by 
modern historians. I have found a reference to it only in Jacob Burckhardt. It was 
different in antiquity. In one of his plays, Aeschylus mentions “ἀριθμόν, ἔξοχον
σοφισμάτων” (Number, outstanding (concept) among the ingenious inventions).
And Aristotle says that everything was created by God with the exception of the 
concept of number, which is man’s invention. 

Mentioning the name of Aristotle could be the starting point for a survey of the 
role of mathematics and of mathematical concepts as an object of philosophical 
investigations, including a history of epistemology. Being a mathematician and not 
a philosopher, I neither can nor will discuss these things. However, I should like to 
touch at least briefly on the work of three eminent philosophers who assigned to 
mathematics an extraordinary role in their systems. They are Plato, Leibnitz, and 
Spinoza. 

Plato considers knowledge of mathematics to be a prerequisite of citizenship. 
Specifically, he states that anyone who calls himself a civilized person should know 
that there exist incommensurable quantities in geometry. For example, it is 
impossible to find a unit of length such that both the side and the diagonal of a 
square are integral (= whole) multiples of this unit. This is indeed a surprising
fact. It requires a sophisticated proof and it is something beyond the range of 
intuitive perception. But why should everybody know it? Plato wanted everybody 
to know that some facts, even surprising ones, are absolute certainties. To
understand this need for certainty, one should read the plays of Aristophanes, 
which exhibit the emergence of nihilism in Plato’s time. If we use Nietzsche’s 
definition of nihilism as the doctrine: ‘Nothing is true. Everything is permitted”, 
we find it fully illustrated in “The Clouds”. In another play, “The Birds”, we see 
the human race entering into an alliance with the birds in order to destroy the 
power of the gods. It should be remembered that these plays were performed in 
honor of a particular god, Dionysus. In still another play, “The Frogs”, this very 
god receives a thrashing. 

Plato tried to fight nihilism by exhibiting mathematics as a source of absolute 
truth and certainty. Today, we know that the truths and certainties of mathematics 
are relative rather than absolute, so we are not in a good position to fight nihilism 
in this way. 

Leibnitz was both a philosopher and an eminent mathematician. No one ever 
thought more highly of mathematics than he. According to Leibnitz, mathematics 
is the science that tells us what is possible. As far as the physical world is 
concerned, i.e., that aspect of the world that Descartes called “res extensa”, this
statement contains at least some truth. But Leibnitz goes further. According to 
him, God, the supreme mathematician, created our particular world by choosing of 
all possible worlds the one with the greatest plenitude and variety. In this sense, 
ours is “the best of all possible worlds.” 

The success of the exact sciences (which are based on the use of mathematics) 
has increased the range of our knowledge of the universe to a degree enormously 
beyond that available to Leibnitz. Paradoxically, this has made many of us 
(including myself) more modest, because our more extensive knowledge has made 
us more aware of the range of our ignorance. We are more reluctant than Leibnitz 
to make definite statements about the universe, and we certainly would not make a 
statement like that of Descartes who said: “Give me matter and motion, and I 
shall make the world once more.” 

Like Leibnitz, Descartes was both a philosopher and an eminent mathemati- 
cian. But mathematics does not play an explicit role in his philosophy although he 
is extremely important for the history of the exact sciences through his dichotomy 
of the world into “res extensa” and “res cogitans.” We still are influenced by his 
perception of the world. But at least one philosopher made a heroic attempt to 
overcome this dualism, and thought he could do so by using not mathematics 
proper, but at least the methods of mathematics. 

Spinoza proposed to derive definite philosophical truths from self-evident 
statements “more geometrico”, i.e., in the manner of Euclid. Although there can
be no doubt that Spinoza provided deep and important insights, this is not due to 
his method, which does not qualify as a mathematical argument. I shall illustrate 
with an example taken from the first page of his main work, the “Ethics”. There we 
find the statement “By God, I understand Being absolutely infinite.” What is 
“absolutely” infinite? Spinoza did not know of the discovery of Georg Cantor 
(published in 1895) according to which there are smaller and larger infinitudes. For 
instance, there are more points in a finite interval on a straight line than there are 
natural numbers 1,2,3... (ad inf). What is worse, there is an infinite sequence of 
infinitudes, each larger than the previous one. And assuming that there exists a 
largest infinitude containing all the previous ones would lead to a contradiction. 
Now we may be able to live with a contradiction, but we cannot tolerate it in a
mathematical argument. 

The lesson from this is simple enough. Before we start relying on mathematics
we have to understand both its potential and its limitations. 


2. WHAT IS MATHEMATICS? There exists a book by Courant and Robbins with 
this title. It tries to answer the question by giving examples. I shall try to give a 
general description first and then illustrate it with a few examples. Please note: 
The following remarks must not be taken for an attempt at giving an epistemologi- 
cal definition of mathematics. Their purpose is merely to provide an intuitive 
understanding of the nature of mathematics. 

Mathematics deals with concepts subject to the rules of logic, in particular to the 
postulate of the excluded middle. There exists at least one set of concepts of this type, 
namely that of the natural numbers. 


Comments. It is not true that all statements involve concepts that are subject to 
logic. If we have a green piece of cloth with a bluish tint we may be in doubt 
whether we should call the color green or blue-green, and we may even disagree 
about the name we wish to give to the color. Similarly, we cannot say that a person 
is either tall or not tall. Even if we give an artificial definition of tallness (say, 6 
feet or more) we may run into trouble because no measurement is absolutely 
precise. (There is a good reason why we have something like 250000 laws in this 
country. The law uses strangely defined concepts and has to be more and more 
casuistic to make them fit reality.) 

As far as I know, Nietzsche was the first to point out this fact. He claimed that 
only man-made concepts are subject to logic. With due respect for Aristotle I 
would take sides against him and Nietzsche and say with Kronecker (a 19th-cen- 
tury mathematician) that “God made the whole numbers. All the rest is the work 
of man.” I hope that these remarks will suffice. I am not prepared to make 
statements about the “reality” of the natural numbers in philosophical (ontologi- 
cal) terms. 

Mathematical research has two important and, I believe, unique characteristics: It 
involves an element of the infinite—being the only secular human activity to do 
so—and it produces an increasing wealth of problems with increasing abstraction. 


Comments and examples. The element of the infinite in mathematics can be used 
to prove (in this case truly “more geometrico”) that the human mind is superior 
to any conceivable electronic computer. I cannot describe the arguments needed 
here without becoming rather technical. They are forever linked with the name of 
one of the greatest mathematicians of our time, Kurt Gödel (1906-1978). In an age 
where scientists as well as philosophers try to tell us that we are really nothing 
particular—a survival mechanism for our genes or, at best, a freakish and rather 
unpleasant animal that, after all, is just capable of doing some things a little better 
than the more pleasant chimpanzee—our mathematical abilities provide perhaps 
the simplest and strongest non-metaphysical argument for our special position in 
nature. 

To illustrate these remarks I shall use two examples. The first one is a theorem 
of number theory, which can be stated as follows: 

Every natural number N is the sum of the squares of at most four natural numbers. 
Unless N is of the form 4^a(8b + 7) with integers a, b ≥ 0, at most three squares suffice. 
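
A finite search cannot prove such a statement, but it can illustrate it. The following Python sketch (the names `min_square_counts`, `has_legendre_form` and the bound `LIMIT` are illustrative choices, not part of the original text) computes the least number of positive squares summing to each N up to a bound. In that range the values needing four squares turn out to be exactly those of the form 4^a(8b + 7), in agreement with Legendre's three-square theorem.

```python
from math import isqrt

def min_square_counts(limit: int) -> list[int]:
    """best[n] = least number of positive squares summing to n, for 0 <= n <= limit."""
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # Remove one square k*k and reuse the answer for the remainder.
        best[n] = 1 + min(best[n - k * k] for k in range(1, isqrt(n) + 1))
    return best

def has_legendre_form(n: int) -> bool:
    """True if n = 4**a * (8*b + 7) for integers a, b >= 0 (expects n >= 1)."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7

LIMIT = 200  # arbitrary illustrative bound
best = min_square_counts(LIMIT)
need_four = [n for n in range(1, LIMIT + 1) if best[n] == 4]

print("largest count needed:", max(best[1:]))
print("values needing four squares:", need_four)
```

The check covers only finitely many N, which is exactly the limitation the text goes on to describe.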

Obviously, no amount of direct calculations can prove this theorem because it 
involves an infinitude of numbers. The proof is neither easy nor obvious and was 
given (for the first part of the theorem) in the late eighteenth century by Lagrange. 
It clearly illustrates what I mean by “an element of the infinite.” I shall need this 
theorem again later when discussing the motivation of the mathematician. But for 
now I need another example to illustrate my remarks about abstraction, and for 
this purpose I shall start with the Koenigsberg Bridge Problem: 

In 1735, the great Swiss mathematician, Euler, came across a peculiar problem 
and described it as follows: “In the town of Koenigsberg there is an island called 
Kneiphof, with two branches of the river Pregel flowing around it. There are seven 
bridges crossing the two branches. The question is whether a person can plan a 
walk in such a way that he will cross each of these bridges once but not more than 
once... . On the basis of the above I formulated the following very general 
problem for myself: Given any configuration of the river and the branches into 
which it may divide, as well as any number of bridges, to determine whether or not 
it is possible to cross each bridge exactly once.” 

Figure 1 shows the layout of the seven bridges of Koenigsberg. Can one stroll 
across each of these bridges once but no more than once? If so, how? 


Figure 1 


We begin to study this problem by throwing away unnecessary information. 
Since the shape and size of the islands and the countryside on the banks of the 
river do not matter at all, we contract each of the four areas labeled respectively 
A, B, C, and D to a single point. The width and shape of the bridges do not 
matter, either, so we replace each of them by a segment of a line or by a curve. 
Figure 2 shows the result of this process. This looks rather similar to a part of the 
subway maps exhibited in the trains in New York City, and we could rephrase the 
problem accordingly with stations and subway rides between them. 


Figure 2 


However, we prefer to use the standard mathematical terminology. We shall call the whole 
figure a graph. The points A, B, C, D, are called vertices, and the connecting lines 
are called edges. Furthermore, we shall call the number of edges going through 
a vertex the order of the vertex. Obviously, the orders of A, B, C, D, are, 
respectively, 3, 5, 3, 3. Our problem now is: Can we find a path (i.e., a succession 
of edges, each having exactly one point in common with the previous one) such 
that this path goes through all the edges exactly once? Such a path is called an 
Euler path. 

Any graph for which this question can be answered in the affirmative will be 
called an Euler graph. (This is an ad hoc notation and not common usage.) Now we 
can show easily that our graph is not an Euler graph. The reason for this is the 
following theorem: 

If a connected graph is an Euler graph then there are either no vertices of odd order 
or exactly two vertices of odd order. 

To prove this, let us imagine that we erase any part of the path through which 
we have traveled in our attempt to travel through all edges exactly once. Whenever 
we enter a point and then leave it again, we will have to erase two edges through 
this point, reducing thereby its order by two. Therefore, if a point has odd order, 
then we may start our path at it, possibly pass through it several times, but then we 
cannot come back to it in the end, because its order would have to be even for this 
purpose. Therefore, if there is a point of odd order, we may try to start our path at 
it, but then we must end our journey at another point of odd order. (What is true 
for the starting point is also true for the terminal point, since we can reverse our 
journey). This proves our theorem, because all points other than the starting point 
and the terminal point must be of even order. 
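
The necessary condition in the theorem is easy to test mechanically. The sketch below is a minimal illustration in Python; the edge list `BRIDGES` is an assumption reconstructed from the orders 3, 5, 3, 3 stated above (Figure 2 itself is not reproduced here), and the function names are hypothetical.

```python
from collections import Counter

# One pair per bridge. Multiplicities are inferred from the stated orders
# of A, B, C, D (3, 5, 3, 3); the labels follow Figure 2.
BRIDGES = [("A", "B"), ("A", "B"), ("B", "C"), ("B", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

def vertex_orders(edges):
    """Count how many edge ends meet at each vertex."""
    orders = Counter()
    for u, v in edges:
        orders[u] += 1
        orders[v] += 1
    return dict(orders)

def passes_odd_order_test(edges):
    """Necessary condition from the theorem: zero or two vertices of odd order."""
    odd = [v for v, k in vertex_orders(edges).items() if k % 2 == 1]
    return len(odd) in (0, 2)

print(vertex_orders(BRIDGES))
print(passes_odd_order_test(BRIDGES))
```

The test only rules graphs out. A graph that passes it still needs a separate argument, which is the converse question raised next.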

This is not a deep or difficult theorem, but Euler asked immediately the 
questions that every mathematician would ask in this situation: Is the converse of 
our theorem true? In other words: Suppose we have a connected graph in which all 
points are of even order. Can we start an Euler path anywhere and get back, in the 
end, to our starting point? And suppose we have a connected graph with exactly 
two points of odd order? Can we start an Euler path at one of the points of odd 
order and terminate it at the other one? 

So far, we have established a first level of abstraction. It has led from a “finite” 
or “concrete” problem (the original Bridge Problem) to a general problem, and to 
a general theorem that covers an infinitude of “concrete” problems, namely, all 
possible graphs that we can draw. Now comes a second level of abstraction. We do 
not have to draw anything to define a graph. All we need is an “incidence table”, 
which lists the points and the number of edges joining any two of them. The incidence 
table for Figure 2 would then look as follows: 

        A  B  C  D
    A   0  2  0  1
    B   2  0  2  1
    C   0  2  0  1
    D   1  1  1  0

The numbers in the first row simply 
mean that A is connected with no edge to itself, with two edges to B, with no edge 
to C and with one edge to D. The numbers in the other rows are similarly defined. 

I hope that you will agree that incidence tables are more abstract than the 
graphs. But now abstraction raises immediately a new problem. We can design 
incidence tables without giving a graph. 
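
An incidence table is simply a symmetric array of nonnegative integers, so it can be written down and examined without any drawing. The Python sketch below assumes the table for Figure 2: its first row is the one described in the text, and the remaining rows are inferred from the orders 3, 5, 3, 3. The function names are illustrative.

```python
VERTICES = ["A", "B", "C", "D"]

# TABLE[i][j] = number of edges joining vertex i and vertex j.
TABLE = [
    [0, 2, 0, 1],
    [2, 0, 2, 1],
    [0, 2, 0, 1],
    [1, 1, 1, 0],
]

def is_valid_table(table):
    """A table describes some graph if it is square, symmetric and nonnegative."""
    n = len(table)
    if any(len(row) != n for row in table):
        return False
    return all(table[i][j] == table[j][i] and table[i][j] >= 0
               for i in range(n) for j in range(n))

def orders_from_table(table):
    """Order of each vertex: its row sum, with a loop (diagonal entry) counted twice."""
    return [sum(row) + row[i] for i, row in enumerate(table)]

def edge_count(table):
    """Total number of edges, counting each unordered pair once."""
    n = len(table)
    return sum(table[i][j] for i in range(n) for j in range(i, n))

print(dict(zip(VERTICES, orders_from_table(TABLE))))
print(edge_count(TABLE))
```

Any array that passes `is_valid_table` defines a graph in this abstract sense, whether or not anyone has drawn it.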

Question: when does an incidence table define a graph that can be drawn in the 
plane so that no two edges cross each other? This is not always the case. The 
problem was solved in the 20th century by Kuratowski. 

The reason why I brought up these things is this: In general, abstraction reduces 
the number of statements we can make and the number of questions we can ask. 
We can say more about one species of birds than about birds in general and more 
about birds than about the animal kingdom, etc. That the situation is different, at 
least in many cases, for mathematics is a curious and, as far as I know, unique 
occurrence. 


3. WHAT CAN MATHEMATICS DO? The functions of mathematics may be 
described as an extension of some of the functions of language. Before I start 
explaining this I have to demolish a statement that I have learned or read 
frequently as a sort of platitude, namely: “Mathematics is a language.” Now even 
platitudes can be true, but this one happens to be sheer nonsense. It is true that 
mathematics uses a special language, and the reason is that our everyday language 
uses concepts that are not subject to logic and therefore is not suitable for the 
formulation of mathematical arguments and results. But a loom is not a piece of 
cloth, and technical terms are not ideas. 

In Genesis 2, God gives the privilege of naming all living beings to Adam, who 
represents the human race. To name things (even non-material ones like 
feelings and sensations) is an act of abstraction. It has been pointed out by Hans 
Jonas that it is also an act of image-making, another important and specifically 
human privilege, a sort of secondary creativity. (The ability to make images has 
been used by Jonas for the purpose of determining man’s “specific difference” in 
the animal kingdom.) The ability to name things is the basis of our ability to plan 
and provides us with a tremendous increase of our ability to communicate, which is 
essential for the emergence of a coherent human society. 

Now I want to show that mathematics, too, has the function of image making 
and that this function gives us the ability to predict. I have to be rather sketchy 
here, in order to avoid technicalities as far as possible. 

Surprisingly, mathematics provides us with abstract images of things that are not 
accessible to the direct perception of our senses. At the same time, the image can 
be made so precise or faithful that it allows us to know all aspects of the original 
that are of importance to us. However, I can give here only an example, which still 
is not really abstract. I hope it will give at least an idea of what all of this is about. 

Let me start with the fact that we can draw a map of the globe on a flat piece of 
paper in such a way that the map can be used for navigation. This is an important 
achievement of mathematics. There are many ways to do this. One of the first was 
found by Gerhard Kremer, who is known under the latinized form of his name 
Gerhardus Mercator (1512-1594). The problem is a difficult one since it is 
impossible to make such a map without distorting distances. 
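
For concreteness, here is a minimal Python sketch of the usual spherical form of Mercator's map. The essay gives no formulas, so the formula y = ln tan(π/4 + φ/2), the unit-sphere assumption, and the function names are all supplied here for illustration. The map preserves angles, which is what makes it usable for navigation, and it pays for this by stretching lengths by a factor 1/cos φ at latitude φ.

```python
import math

def mercator(lat_deg: float, lon_deg: float) -> tuple[float, float]:
    """Map a point of the unit sphere to plane coordinates (x, y)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    if abs(lat) >= math.pi / 2:
        raise ValueError("the poles are sent to infinity")
    x = lon
    y = math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

def scale_factor(lat_deg: float) -> float:
    """Local stretching of lengths at this latitude (1 at the equator)."""
    return 1.0 / math.cos(math.radians(lat_deg))

for lat in (0, 30, 60, 80):
    print(lat, mercator(lat, 0.0)[1], scale_factor(lat))
```

The growth of the scale factor toward the poles is the distortion of distances mentioned above.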

This is still a rather elementary example of the image-making power of mathe- 
matics. A much more sophisticated and much more important example is the 
mathematical image of an atom with a nucleus and electrons. This mathematical 
image consists entirely of formulas. But these formulas permit us, at least under 
certain circumstances, to make predictions about the behavior of the atom. This is 
an enormous achievement, and it is but one example of the role of mathematics in 
physics, chemistry, and the branches of technology based on these sciences. 

Here I have to put in a word of warning. We are always tempted to overestimate 
the power at our disposal. In the case of language, or, specifically, of “naming”, the 
overestimation of our power appeared in the form of incantations or conjuring. 
(Mathematical objects, in particular the pentagram, have been abused for the same 
purpose.) In the case of mathematics, it appears in the more subtle form of 
applying mathematical deductions to situations where this is not justified. In both 
cases there is involved an element of cheating, trying to get something too easily. 
However, I cannot try here to describe the misuses of mathematics. It is a difficult 
topic, and it requires careful study. But I shall discuss briefly another aspect of the 
use of mathematics in physics and other exact sciences that was nicely formulated 
by Eugene Wigner as the title of a talk some years ago: “The unreasonable 
effectiveness of mathematics in the exact sciences.” 

Indeed, it has been a cause of astonishment for a long time that mathematics 
can be used at all to understand and even control the physical world. I believe that 
this astonishment is somewhat misplaced and that it is a consequence of Descartes’ 
philosophy, which divides the world into “res extensa” and “res cogitans”, and I 
should like to counter it with an aphorism by Lichtenberg (an 18th century 
physicist and essayist) who said: “I have found people who were astonished that 
cats have holes in their furs exactly at the places where the eyes are.” 

Obviously, the cat would not exist if it were otherwise. Similarly, we need order 
for our existence. It is therefore not so surprising that the world contains an 
element of order and that we have an organ to deal with it. Of course, this does 
not explain the extent of our ability to apply mathematics, nor does it explain the 
fact that the human race developed mathematics long before it became useful. So 
there is indeed a reason for some astonishment, which, however, should include 
other phenomena, such as our ability to appreciate and to create beauty and to do 
many other things that, at least originally, provided no visible help in the preserva- 
tion of our species. I shall have to say more about these things in the next section. 
For now, I should like to mention a negative service that mathematics can perform. 
Mathematics can tell us that there are things we cannot do with the means at our 
disposal. For example, suppose we wish to seat the representatives, one for each, 
of the ca. 150 members of the United Nations at a conference table. We cannot 
list the possible seating arrangements since their number would be greater than 
the number of electrons and protons in the known Universe. Of course, we are not 
particularly interested in such seating arrangements. But we might be interested in 
the arrangements of genetic material in chromosomes. I have no exact data at 
hand to discuss this problem, but the numbers are large, too. 
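
The seating example can be checked with exact integer arithmetic. In the Python sketch below, the figure of 10^80 for the number of particles is a commonly quoted order-of-magnitude estimate and is an assumption, not a value taken from the text; the function names are illustrative.

```python
from math import factorial

DELEGATES = 150            # "ca. 150" in the text
PARTICLE_BOUND = 10 ** 80  # assumed order-of-magnitude estimate

def seatings_in_a_row(n: int) -> int:
    """Number of ways to order n people along one side of a table."""
    return factorial(n)

def seatings_round_table(n: int) -> int:
    """Number of orders around a round table, rotations regarded as the same."""
    return factorial(n - 1)

row = seatings_in_a_row(DELEGATES)
ring = seatings_round_table(DELEGATES)

print("decimal digits:", len(str(row)), len(str(ring)))
print("exceeds particle estimate:", ring > PARTICLE_BOUND)
```

Even the smaller of the two counts exceeds the particle estimate by well over a hundred orders of magnitude, so no listing of the arrangements is feasible.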


4. THE PHENOMENON OF MATHEMATICS. We shall use the term “mathe- 
matics” in its strict sense: The systematic derivation of theorems with the help of 
explicitly formulated arguments. Some mathematical insights are intuitively clear, 
e.g., that a diameter divides a circle into two equal parts; Thales (ca. 624-548 B.C.) 
is supposed to have proved this. The fact that the side and diagonal of a square are 
incommensurable is not at all intuitively clear. Its discovery is ascribed to the 
Pythagorean school. A well-formulated proof of this and of related theorems 
appeared at the time of Plato and was due to his friend Theaetetos. Although 
Babylonian, Indian, and Chinese scholars developed a large body of mathematical 
knowledge, it is safe to say that mathematics in the strict sense is a creation of the 
Greeks. This does not mean the Athenians. With the exception of Theaetetos, 
none of the great Greek mathematicians lived in Athens. Euclid lived in Alexan- 
dria in Egypt. So did Apollonius. Archimedes lived in Syracuse in Sicily. Nothing 
like the systematic works of Euclid and Apollonius is known from other civiliza- 
tions of the same or of earlier times. 

What motivated these mathematicians? Not technology, not even astronomy, 
which, after all, was in its more sophisticated aspects not a “practical” matter at 
all. It is true that Archimedes developed technological applications of mathemat- 
ics, in particular an instrument to compute the position of the planets. But the 
Romans who certainly needed and used high level technology never contributed 
anything to mathematics. In fact, the systematic use of mathematics for the 
development of technology (excluding astronomy) starts only in the 18th century. 
The cause of the development of mathematics was not usefulness. Earlier, we 
compared some functions of mathematics to some functions of language. The 
analogy goes even further. Language, too, is not merely an instrument of power or 
of usefulness. Nor is poetry. As far as mathematics is concerned, a good summary 
of its role appeared in an editorial (by Chandler and Edwards) in The Mathematical 
Intelligencer: 


It is a perennial problem for mathematicians to explain to the public at large 
what makes mathematics worthwhile if not its practicality. It is like explaining 
to someone who has never heard music what a lovely melody is... . Do let us 
try to teach the general public more of the sort of mathematics that they can 
use in everyday life, but let us not allow them to think—and certainly, let us 
not slip into thinking—that this is an essential quality of mathematics. 

There is a great cultural tradition to be preserved and enhanced. Each 
generation must learn the tradition anew. Let us take care not to educate a 
generation that will be deaf to the melodies that are the substance of our 
great mathematical culture. 


In the past, some poets understood the beauty of mathematics. I already 
mentioned Aeschylus. Calderon speaks of “sublime mathematics” and Schiller calls 
it “divine”. There are certainly more examples of this type, but they seem to 
become rare if not extinct in modern times. The reason for this is, of course, 
increasing inaccessibility of mathematics. Our latest products are available only to 
very few people. But the columns by Martin Gardner and occasional essays on 
mathematics in the Scientific American show that a much larger part of the 
population understands what mathematics is about. Some people are fascinated by 
Lagrange’s theorem (mentioned earlier), but certainly not everybody is. However, 
little would be left of human civilization if we restrict it only to things that enjoy 
universal appreciation. 

There is one more aspect of mathematics that, although well known, usually is 
mentioned as a mere curiosity. I believe it is more than that since it relates to the 
important idea of evolution. What are its uses? Is it a matter of pure chance or is it 
a response to a need, a change of conditions and environment or both? Consider 
the development of the exact sciences and its result, the human ability to dominate 
nature. In several cases, scientists found the mathematical tools they needed 
ready-made and available, sometimes formulated centuries earlier. The philoso- 
pher Whitehead mentions the conics, which had been thoroughly investigated by 
Apollonius in the third century B.C. and were available to Kepler in the 17th 
century A.D. The most surprising example I know of is the theory of probability. 
First of all, it is strange that even a situation of complete disorder, that of random 
events, should be subject to mathematical laws. Secondly, what provoked the study 
of probability was an almost universally despised human activity, namely, gambling. 
But one of the main contributors to the theory of probability was Pascal, who gave 
up mathematics because he thought that the only truly important thing in life was 
to work for the salvation of one’s soul. And finally it turned out that the laws of 
probability are an essential ingredient of the laws of nature. This insight started in 
the 19th century with Boltzmann and culminated in our century with the develop- 
ment of quantum theory. Einstein could never overcome his intuitive objections 
against this development. He said: “God does not play dice.” Niels Bohr’s answer 
to that was: “We cannot tell God how to shape the universe.” 

Coming back to the causes of evolution: We can never refute those who say that 
everything is due to pure chance. At best, we may be able to embarrass them. But 
it is hard to see here a response to a need. The theory had been developed long 
before it was needed. 


5. WHAT MAKES A MATHEMATICIAN? There exists a widespread resentment 
against mathematics. It is supposed to deal only with quantity (not true, since most 
of mathematics deals with structure and relations), or with computing (again not 
true, but I cannot explain that in a few words) and, on the whole, it is more worthy 
of a machine than of a human being. As an aid to science and technology, it does 
not provide values and is therefore dehumanizing. Even the claim of the mathe- 
matician to be concerned with truth is frequently answered by saying that mathe- 
matical statements are not true but merely correct. Nevertheless, it is undoubtedly 
true that the results of mathematics are found by human beings. Can anything be 
said about them? 

The answer is: Not enough to enable us to recognize a mathematician if we 
meet one at a party. Nevertheless, there exist properties without which a mathe- 
matician cannot exist. One of them is, of course, a specific talent. But this is far 
from being enough. It must be supplemented by an interest in the matter, in fact 
by a fascination with the problems of the field. And the talent must be supported 
by persistence and by the willingness to spend the large amounts of time and 
energy needed to master a difficult craft. And the mathematician needs an 
exceptionally great ability to stand up under frustration. This is due to the fact 
(pointed out to me by a colleague) that ours is the only field with an all-or-nothing 
alternative. A painting or a piece of furniture may be more or less perfect. A 
theorem and a proof are either true or false. If either the proof or the theorem is 
false, we have absolutely nothing. Finally, we must be satisfied with the production 
of something intangible. I have found housepainting to be a gratifying supplement 
to mathematical research. At least one can see and touch what one has done. 

It follows that the mathematician needs the support of a civilization that 
acknowledges as valuable the products of theory, of pure thought. Although we do 
not set a scale of values, we would not exist without such a scale. I can be brief 
here, since the arguments given by the philosopher Cantore for the humanistic 
significance of science apply, with small modifications, to mathematics as well. 

Let me conclude by pointing out one advantage that the mathematician (and, 
with him, the representative of the exact sciences) has. Our thoughts are eminently 
communicable. Not, perhaps, from person to person. But certainly from nation to 
nation. Mathematicians understand each other no matter where they come from. 
Even across many centuries we understand each other. We may not see clearly 
what a particular expression in Euclid means. But we are confident that, could we 
talk with him, we would be able to clear up the matter quickly. Nothing is more 
international than the community of mathematicians. 


PROBLEMS AND SOLUTIONS 


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 


with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, M. J. Pelling, Richard Pfiefer, Leonard Smiley, John Henry 
Steelman, Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY 
problems address on the inside front cover. Submitted problems should include 
solutions and relevant references. Submitted solutions should arrive at that address 
before August 31, 1997. Additional information, such as generalizations and 
references, is welcome. The problem number and the solver’s name and address should 
appear on each solution. An acknowledgement will be sent only if a mailing label 
is provided. An asterisk (*) after the number of a problem or a part of a problem 
indicates that no solution is currently available. 


PROBLEMS 


10578. Proposed by Herbert S. Wilf, University of Pennsylvania, Philadelphia, PA. Consider 
the sequence $y_2, y_3, \ldots$ defined by the recurrence relation 

\[ (n + 1)(n - 2)\, y_{n+1} = n(n^2 - n - 1)\, y_n - (n - 1)^3\, y_{n-1} \]

and initial conditions $y_2 = y_3 = 1$. Show that $y_n$ is an integer if and only if $n$ is prime. 
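
The claim in Problem 10578 can be explored numerically before attempting a proof. The Python sketch below assumes the recurrence reads (n + 1)(n - 2) y_{n+1} = n(n^2 - n - 1) y_n - (n - 1)^3 y_{n-1} with y_2 = y_3 = 1, and uses exact rational arithmetic; the function names are illustrative, and a finite check of this kind is of course no substitute for the requested proof.

```python
from fractions import Fraction
from math import isqrt

def wilf_terms(last: int) -> dict[int, Fraction]:
    """Return y_2, ..., y_last computed exactly from the recurrence."""
    y = {2: Fraction(1), 3: Fraction(1)}
    for n in range(3, last):
        numerator = n * (n * n - n - 1) * y[n] - (n - 1) ** 3 * y[n - 1]
        y[n + 1] = numerator / ((n + 1) * (n - 2))
    return y

def is_prime(n: int) -> bool:
    """Trial division, adequate for the small range used here."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

terms = wilf_terms(40)
for n in range(2, 13):
    print(n, terms[n], is_prime(n))
```

A `Fraction` is stored in lowest terms, so `terms[n].denominator == 1` is the test for integrality.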


10579. Proposed by Daniel Goffinet, St.-Etienne, France. Call a function $f : \mathbb{R} \to \mathbb{R}$ 
affinely even if, for some $a \in \mathbb{R}$, $f(a + x) = f(a - x)$ for every $x \in \mathbb{R}$. 

(a) Is every function $F : \mathbb{R} \to \mathbb{R}$ the sum of two affinely even functions? 

(b) Is every continuous function $F : \mathbb{R} \to \mathbb{R}$ the sum of two continuous affinely even 
functions? 


10580. Proposed by Stephen C. Locke, Florida Atlantic University, Boca Raton, FL. Let $G$ 
be a simple graph with $v$ vertices and $e$ edges and with maximum degree at most 3. Suppose 
that no component of $G$ is a complete graph on 4 vertices. Prove that $G$ contains a bipartite 
subgraph $H$ with at least $e - v/3$ edges. 


10581. Proposed by Stephen Herschkorn, RUTCOR, New Brunswick, New Jersey. Let $X$ be 
a nonnegative random variable such that the event $\{X > k\}$ has positive probability for every 
real number $k$. Consider the collection of nonnegative integer-valued random variables $N$ 
that are independent of $X$. Must there exist such an $N$ for which $E(x^N)$ is finite for every 
real number $x$ but $E(X^N)$ is infinite? 


10582. Proposed by Peter Lindqvist and Kristian Seip, Norwegian University of Science 
and Technology, Trondheim, Norway. Let $\mu(n)$ denote the Möbius function and $\zeta(s)$ denote 
the Riemann zeta function. Prove that 
when $s > 1$. 


10583. Proposed by Jacob Lurie, Bethesda, MD. Let $U$ be a nonempty bounded open 
set in $\mathbb{R}^n$. For any two points $p$ and $q$ on the boundary of $U$, suppose there is an affine 
transformation sending $U$ to itself carrying $p$ to $q$. Show that there is an affine transformation 
that carries $U$ to the unit ball. 


10584. Proposed by Charles Conley, Oklahoma State University, Stillwater, OK. Let 
$f : \mathbb{R} \to \mathbb{R}$ be a continuous function such that, for all $d \in \mathbb{R}$, the function $f_d : \mathbb{R} \to \mathbb{R}$ defined 
by $f_d(x) = f(x + d) - f(x)$ is infinitely differentiable. Is $f$ itself infinitely differentiable? 


SOLUTIONS 


Rational Approximations with Odd Denominators 


10242 [1992, 675]. Proposed by S. Brocco, Brandeis University, Waltham, MA, and F. 
Mignosi, Institut Blaise Pascal, Paris, France and Università di Palermo, Palermo, Italy. 
Let $\alpha$ be a fixed irrational number. 

(a) For fixed integer $n$ with $n > 1$, show that it is possible to find a constant $c(n)$ such that 
there are infinitely many rationals $p/q$ with $q$ relatively prime to $n$ and $|\alpha - p/q| < c(n)/q^2$. 
(b) If the continued fraction of $\alpha$ has unbounded partial quotients and $\epsilon > 0$ is given, can 
one find $c(n) < \epsilon$ satisfying the above condition? 


Solution by the editors, based on solutions by the proposers and by the late Raphael M. 
Robinson. As in G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 
fifth ed., Oxford, 1979, we denote the continued fraction for $\alpha$ by $[a_0, a_1, \ldots]$, with the 
convergents $p_m/q_m = [a_0, a_1, \ldots, a_m]$. The denominators of the convergents satisfy 

\[ q_0 = 1, \quad q_1 = a_1, \quad \text{and} \quad q_m = a_m q_{m-1} + q_{m-2}. \tag{$*$} \]

It follows from (*) and a similar recurrence for the $p_m$ that $p_m q_{m-1} - q_m p_{m-1} = (-1)^{m-1}$, 
so that 

\[ \frac{p_m}{q_m} - \frac{p_{m-1}}{q_{m-1}} = \frac{(-1)^{m-1}}{q_m q_{m-1}}. \]

This leads to estimates on $|\alpha - p_m/q_m|$. In particular, it is known that every convergent 
satisfies $|\alpha - p_m/q_m| < 1/q_m^2$, while if $|\alpha - p/q| < 1/(2q^2)$, then $p/q = p_m/q_m$ for 
some $m$. 

Consider first the case where $n$ is prime. Then (*) shows that $\gcd(q_m, q_{m-1}) = 1$ for 
$m = 1, 2, \ldots$, so at least one of the pair $\{q_m, q_{m-1}\}$ must be relatively prime to $n$. Infinitely 
many convergents thus satisfy the required property, so we may take $c(n) = 1$ whenever $n$ 
is prime. 

In general, one cannot expect that infinitely many $q_m$ are relatively prime to a composite 
$n$. For example, take $n = 6$, $a_0 = 0$, $a_1 = 2$, $a_2 = 1$, with $2 \mid a_{2k-1}$ and $3 \mid a_{2k}$ for $k > 1$. Then 
(*) shows that $2 \mid q_{2k-1}$ and $3 \mid q_{2k}$ for all $k \ge 1$. In particular, this gives a negative answer 
to (b), since the $a_m$ may grow rapidly, while $c(6) > 1/2$ since $\gcd(q, 6) = 1$ requires that 
$p/q$ is not a convergent. 

We now show that we may take $c(n) = (n + 1)(n + 2)$ in (a). Let $p = b p_m + p_{m-1}$ and 
$q = b q_m + q_{m-1}$ for a positive integer $b$. Then $b q_m < q < (b + 1) q_m$, and 

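\[ \left| \alpha - \frac{p}{q} \right| \le \left| \alpha - \frac{p_m}{q_m} \right| + \left| \frac{p_m}{q_m} - \frac{p}{q} \right| < \frac{1}{q_m^2} + \frac{1}{q_m q} < \frac{(b + 1)(b + 2)}{q^2}. \]
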
Let $b$ equal the product of those primes that divide $n$ but not $q_{m-1}$. Then $b \le n$, and since 
each prime factor of $n$ divides precisely one of either $b q_m$ or $q_{m-1}$, it cannot divide $q$. Thus 
$q$ is relatively prime to $n$. 
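
The construction can be followed on a small example. The Python sketch below is an illustration only: the list `QUOTIENTS` is a hypothetical choice of partial quotients matching the $n = 6$ example (with $a_0 = 0$, $a_1 = 2$, $a_2 = 1$, then odd-indexed quotients even and even-indexed quotients divisible by 3), and the function names are not from the source. It computes the convergents, then for each $m$ forms $b$ and $q = b q_m + q_{m-1}$ as described.

```python
from math import gcd

def convergents(partial_quotients):
    """Return [(p_0, q_0), (p_1, q_1), ...] for the continued fraction [a_0, a_1, ...]."""
    p_prev, q_prev = 1, 0               # conventional values of p_{-1}, q_{-1}
    p, q = partial_quotients[0], 1      # p_0 = a_0, q_0 = 1
    out = [(p, q)]
    for a in partial_quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out

def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def coprime_denominator(n, q_m, q_prev):
    """Return (b, q) with b the product of primes dividing n but not q_{m-1}."""
    b = 1
    for r in prime_factors(n):
        if q_prev % r != 0:
            b *= r
    return b, b * q_m + q_prev

N = 6
QUOTIENTS = [0, 2, 1, 4, 3, 2, 6, 8, 9]   # hypothetical example sequence
conv = convergents(QUOTIENTS)
for m in range(1, len(conv)):
    q_m, q_prev = conv[m][1], conv[m - 1][1]
    b, q = coprime_denominator(N, q_m, q_prev)
    print(m, q_m, gcd(q_m, N), b, q, gcd(q, N))
```

In this example no convergent denominator beyond $q_0$ is prime to 6, while every constructed $q$ is.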

A negative answer to (b) for any given $n$ can be found by choosing $\alpha$ with $a_m = n$ if $m$ 
is odd, while $a_m \to \infty$ for even $m$. Then (*) shows that $q_m \equiv 0 \pmod{n}$ if $m$ is odd 
and $q_m \equiv 1 \pmod{n}$ if $m$ is even. However, for even $m$, $|\alpha - p_m/q_m| > 1/((n + 2) q_m^2)$. 


Editorial comment. Much more precise information relating the approximation of $\alpha$ by the 
convergents $p_m/q_m$ to the $a_m$ can be found in T. W. Cusick and M. Flahive, The Markoff and 
Lagrange Spectra, American Mathematical Society, Providence, 1989. From this point of 
view, one should fix $n$ and consider how the quantity here called $c(n)$ depends on $\alpha$. Some 
special cases are known. For example, the case $n = 2$ was studied in Raphael M. Robinson, 
The approximation of irrational numbers by fractions with odd or even term, Duke Math. J. 
7 (1940), 354-359. There it is shown that $|\alpha - p/q| < 1/(2q^2)$ always has infinitely many 
solutions with $q$ odd, and the numbers $\alpha$ for which this is best possible are characterized. 


Two Klein Bottles and a Torus 


10359 [1994, 76]. Proposed by Raphael M. Robinson, University of California, Berkeley, 
CA. Two pairs of sides of the unit square $0 \le x \le 1$, $0 \le y \le 1$ are identified in such a 
way that the surface obtained has a locally Euclidean metric. How many such surfaces are 
there that are inequivalent as metric spaces? 


Solution by the proposer. There are three such manifolds. The identifications must be made 
in such a way that all four vertices are identified, in order to have a Euclidean metric near 
a vertex. There are three essentially different ways of doing this, which lead to a torus T 
and Klein bottles $K$ and $L$. The following identifications are to be made for $0 \le t \le 1$: 
$T$ identifies $(t, 0)$ with $(t, 1)$ and $(0, t)$ with $(1, t)$; $K$ identifies $(t, 0)$ with $(t, 1)$ and $(0, t)$ 
with $(1, 1 - t)$; and $L$ identifies $(t, 0)$ with $(1, t)$ and $(0, t)$ with $(t, 1)$. 

The torus T is topologically different from K and L, since T is orientable and K and 
L are not. Although K and L are topologically equivalent, they are inequivalent as metric 
spaces. One difference is that shortest segments in $K$ whose endpoints coincide have length 
1, whereas in $L$ there are such segments of length $\sqrt{2}/2$, for example the one joining $(1/2, 0)$ 
to $(1, 1/2)$. 

Remark. Let $K(a, b)$ be the manifold obtained by starting with the rectangle $0 \le x \le a$, 
$0 \le y \le b$ and identifying $(t, 0)$ with $(t, b)$ for $0 \le t \le a$, and $(0, t)$ with $(a, b - t)$ for 
$0 \le t \le b$. Then $K = K(1, 1)$, and it can be shown that $L$ is equivalent to $K(\sqrt{2}/2, \sqrt{2})$ 
as a metric space. All of the manifolds $K(a, b)$ are topologically equivalent, but no two of 
them are equivalent as metric spaces. 


A False Leaky Tent 


10367 [1994, 176]. Proposed by Donald R. Chalice, Western Washington University, 
Bellingham, WA. Let $C$ be the Cantor set in $[0, 1]$, and let $E$ be the set of endpoints of 
the removed intervals (together with 0 and 1). Let $F = C - E$ and let $p$ be the point 
$(1/2, 1/2)$. For any $c \in C$, let $L_c$ be the line segment from $p$ to $c$, let $Q_c$ be the points on 
$L_c$ with rational ordinates and $I_c$ the points of $L_c$ with irrational ordinates. The set 

\[ T = \Bigl( \bigcup_{c \in E} Q_c \Bigr) \cup \Bigl( \bigcup_{c \in F} I_c \Bigr) \]

has the property that $T$ is connected, but $T - p$ is totally disconnected. Consider instead 

\[ T_0 = \Bigl( \bigcup_{c \in E} I_c \Bigr) \cup \Bigl( \bigcup_{c \in F} Q_c \Bigr) \]

obtained by interchanging the roles of points with rational and irrational ordinates. Is $T_0$ 
connected? 


Solution by Kenneth Schilling, University of Michigan, Flint, MI. No, it is not. It suffices 
to construct a continuous function $\phi$ from $C$ to the open interval $(0, 1/2)$ such that $\phi(c)$ is 
rational for $c \in E$ and irrational for $c \in F$; then the sets 

\[ T_0 \cap \{(x, y) \in L_c : y < \phi(c),\ c \in C\} \quad \text{and} \quad T_0 \cap \{(x, y) \in L_c : y > \phi(c),\ c \in C\} \]

are disjoint, open relative to $T_0$, and cover $T_0$, and so $T_0$ is not connected. 

To construct such a $\phi$, write $c = .c_1c_2c_3c_4\ldots \in C$, expressed in ternary digits using only
the digits 0 and 2. Put $\phi(c) = y = .0y_2y_3y_4\ldots$, where $y_n = 1$ if $c_n = c_{n-1}$, and $y_n$ is, say,
the kth digit of $\sqrt{2}$ if $c_n$ is the kth digit of c that differs from its predecessor. The idea here is
that if c ∈ E, so that the ternary representation of c is eventually constant, then so is the representation
of $\phi(c)$, so $\phi(c)$ is rational. If c ∈ F, so that the representation of c is not eventually constant,
then the representation of $\phi(c)$ is not eventually repeating, so $\phi(c)$ is irrational. It is easy
to check that $\phi$ is continuous.


Editorial comment. All solvers used an argument similar to that of the selected solution. 
John Cobb noted that the middle thirds set implied by the phrase “the Cantor set” is only 
one of a huge family of homeomorphic sets. He constructs a set C in this family such that: 
(1) every endpoint is rational, (2) every non-endpoint is irrational, and (3) 1/2 ∉ C. If $T_0$ is
constructed from this set using (1/2, 1/2), as in the problem statement, then every vertical
line x = a where a is rational and not in C separates $T_0$.

K. P. Hart gave two solutions. One resembled the selected solution, while the other
showed that $T_0$ is zero-dimensional; that is, it has a base for its topology that consists of
closed-and-open sets. The sets $I_c \cup \{(1/2, 1/2)\}$ for c ∈ E and $V_q = \{(x, y) \in T_0 : y = q\}$
form a countable family of closed zero-dimensional subspaces of $T_0$ whose union is $T_0$. The
conclusion then follows from the Countable Closed Sum Theorem of Dimension Theory 
(see R. Engelking, Dimension Theory, North-Holland, 1978, Theorem 1.3.1). 


Solved also by J. Cobb, R. Griffus, K. P. Hart (The Netherlands), O. P. Lossers (The Netherlands), M. D. Meyerson, 
A. W. Schurle, and the proposer. 


Reverse Dynamics of a Polynomial Map on the Line 


10369 [1994, 273]. Proposed by Bjorn Poonen (student), University of California, Berkeley, 
CA. Let f(x) be a polynomial having rational coefficients and degree at least 2. Suppose 
that $a_1, a_2, a_3, \ldots$ is a sequence of rational numbers such that $f(a_{n+1}) = a_n$ for all n ≥ 1.
Prove that there exists k ≥ 1 such that $a_{n+k} = a_n$ for all n ≥ 1.


Solution by Emre Alkan (student), Bosphorus University, Istanbul, Turkey. We first show
that $\{a_n\}$ is bounded. Clearly $\lim_{|x| \to \infty} |f(x)|/|x| = \infty$, so there exists a constant M such
that $M > |a_1|$ and $|f(x)| \ge |x|$ whenever |x| > M. If $|a_n| > M$, then $|a_{n-1}| = |f(a_n)| \ge |a_n| > M$.
Repeating this yields $|a_1| > M$, a contradiction. Hence $|a_n| \le M$ for all n.

We next construct an integer N such that each $Na_n$ is an integer. Express f by
$f(x) = (b_d x^d + b_{d-1} x^{d-1} + \cdots + b_0)/c$, where $c, b_0, b_1, \ldots, b_d$ are integers. Let r, s be integers
such that $a_1 = r/s$. Let $N = sb_d$. Clearly $Na_1 = rb_d$ is an integer. If $Na_n$ is an integer,
then $(cN^d/b_d)(f(x/N) - a_n)$ is a monic polynomial with integer coefficients that vanishes
at $Na_{n+1}$. By the Rational Zeros Theorem (G. H. Hardy and E. M. Wright, An Introduction
to the Theory of Numbers, fifth ed., Oxford, 1979, Theorem 45, section 4.3, p. 41), $Na_{n+1}$
must also be an integer. By induction, $Na_n$ is an integer for all n.

The set {a,} is bounded, and its elements are multiples of 1/N. The set is therefore 
finite; let m be its size. Consider segments consisting of m + 1 consecutive elements of the 
sequence. Each such segment has one of $m^{m+1}$ patterns, and each pattern has a repeated
element. Since there are infinitely many segments and finitely many patterns, some pattern 
appears infinitely often. 
This yields a repetition, with some fixed displacement k, that occurs infinitely many
times in the sequence. In particular, for each n ≥ 1 there exists j ≥ n such that $a_j = a_{j+k}$.
Repeatedly applying f to both sides of this equation yields $a_n = a_{n+k}$, as desired.


Solved also by R. Barbara (Lebanon), P. Budney, R. Holzsager, R. B. Israel (Canada), J. H. Lindsey II, O. P. Lossers (The 
Netherlands), M. Reid, R. M. Robinson, G. L. Stanek, J. Sturm, A. N. ’t Woord (The Netherlands), and the proposer. 


Hamiltonian Paths Through Isomorphism Classes of Graphs 


10370 [1994, 273]. Proposed by Bernardo Recamán Santos, Waterford Kamhlaba United
World College of South Africa, Mbabane, Swaziland. Let $G_n$ be the undirected graph whose
vertices are the unlabeled graphs on n vertices (e.g. $G_4$ has 11 vertices), two of which are
adjacent in $G_n$ if and only if one can be obtained from the other by deleting an edge.

(a) Show that neither $G_4$ nor $G_5$ contains a Hamiltonian path.

(b)* Does $G_n$ contain a Hamiltonian path for n > 5?


Solution by Frank Schmidt, Arlington, VA. 
(a) The graph G4 is shown in Figure 10370. 


Figure 10370 


Suppose G4 has a Hamiltonian path. Since G4 has 11 vertices, the path contains 10 
edges. The endpoints of the path must be the vertices of degree 1, and hence the path must 
contain all edges incident to vertices of degree at most 2. The ten such edges in G4 (shown 
in heavier lines in Figure 10370) do not form a path, so G4 has no Hamiltonian path. It 
follows from the partial solution of part (b) that G5 does not have a Hamiltonian path. 

(b) (Partial solution) If n ≥ 5 and n is congruent to 0 or 1 modulo 4, then $G_n$ has no
Hamiltonian path. A Hamiltonian path in $G_n$ must alternate between graphs of even size
and graphs of odd size. This requires |e(n) − o(n)| ≤ 1, where e(n) and o(n) denote the
number of isomorphism classes of n-vertex graphs with an even or an odd number of edges,
respectively. In the solution of MONTHLY Problem 10285 [1993, 185; 1996, 268], it was
noted that e(n) − o(n) = s(n), where s(n) is the number of isomorphism classes of self-
complementary graphs with n vertices. This contradicts the known result that s(n) ≥ 2
when n ≥ 5 and n is congruent to 0 or 1 modulo 4 (see M. Kropar and R. Reed, On
the construction of self-complementary graphs on 12 nodes, J. Graph Theory, 3 (1979), 
111-125). 
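
The parity count behind the partial solution can be checked by brute force for small n. The following is only a sketch: the function name `parity_counts` and the bitmask encoding of edge sets are illustrative choices, not part of the published solution. It enumerates every labeled graph on n vertices, keeps one representative per isomorphism class, and tallies the classes by the parity of their edge count.

```python
from itertools import combinations, permutations

def parity_counts(n):
    """Return (e(n), o(n)): unlabeled n-vertex graphs with even / odd edge count."""
    pairs = list(combinations(range(n), 2))          # one bit per possible edge
    slot = {pair: i for i, pair in enumerate(pairs)}
    # For every vertex permutation, record where each edge slot is sent.
    edge_maps = [
        [slot[tuple(sorted((perm[u], perm[v])))] for u, v in pairs]
        for perm in permutations(range(n))
    ]
    even = odd = 0
    for mask in range(1 << len(pairs)):
        edges = [i for i in range(len(pairs)) if mask >> i & 1]
        # mask represents its class iff no relabeling produces a smaller mask
        if all(sum(1 << m[i] for i in edges) >= mask for m in edge_maps):
            if len(edges) % 2 == 0:
                even += 1
            else:
                odd += 1
    return even, odd

for n in (4, 5):
    e, o = parity_counts(n)
    print(n, e + o, e, o, e - o)
```

For n = 4 the difference e(n) − o(n) is 1, so the parity obstruction alone does not rule out a Hamiltonian path in $G_4$, which is why part (a) treats that case directly; for n = 5 the difference is 2. The enumeration is exponential in n and is meant only for these small cases.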


Editorial comment. When n is congruent to 2 or 3 modulo 4, there are no self-complementary 
n-vertex graphs, and the partite sets of $G_n$ are equinumerous. This may permit a Hamiltonian
path. In particular, $G_2$ and $G_3$ are paths. It would be of interest to study this question for
$G_6$, which has 156 vertices and 572 edges.


A self-contained proof of the above results was given by A. N. ’t Woord (The Netherlands); S. C. Locke solved part (a) 
and the cases n=8 and n=9 of part (b); while G. Laman (The Netherlands) and the proposer solved only part (a). 


Maximizing a Ratio of Areas 


10371 [1994, 274]. Proposed by Emil Yankov Stoyanov, Antiem I Mathematical School, 
Vidin, Bulgaria. Let B’ and C’ be points on the sides AB and AC, respectively, of a given 
triangle ABC, and let P be a point on the segment B’C’. Determine the maximum value of 
\[ \frac{\min\{[BPB'], [CPC']\}}{[ABC]}, \]
where [F] denotes the area of F.


Solution by Victor Pambuccian, Arizona State University West, Phoenix, AZ. This is a
problem of ordered affine geometry, since it asks for the maximum value of

\[ \min\left\{ \frac{[BPB']}{[ABC]}, \frac{[CPC']}{[ABC]} \right\}, \]

and ratios of areas of triangles are affine notions (see Léonce Lesieur, Sur la mesure des
triangles en géométrie affine, Math. Z. 93 (1966), 334-344).
Whenever X′ is a point on the side YZ of a triangle XYZ, we have [XX′Y]/[XYZ] =
|YX′|/|YZ|. Using this, we get

\[ \frac{[BB'P]}{[BB'C']} = \frac{|B'P|}{|B'C'|}, \qquad \frac{[BB'C']}{[ABC']} = \frac{|BB'|}{|BA|}, \qquad \frac{[ABC']}{[ABC]} = \frac{|AC'|}{|AC|}. \]

Hence

\[ \frac{[BB'P]}{[ABC]} = \frac{|B'P|}{|B'C'|} \cdot \frac{|BB'|}{|BA|} \cdot \frac{|AC'|}{|AC|} \tag{1} \]

and analogously

\[ \frac{[CC'P]}{[ABC]} = \frac{|C'P|}{|C'B'|} \cdot \frac{|CC'|}{|CA|} \cdot \frac{|AB'|}{|AB|}. \tag{2} \]

Multiplying the right sides of (1) and (2) and grouping the factors conveniently, we get

\[ \left( \frac{|B'P|}{|B'C'|} \cdot \frac{|C'P|}{|C'B'|} \right) \left( \frac{|BB'|}{|BA|} \cdot \frac{|AB'|}{|AB|} \right) \left( \frac{|AC'|}{|AC|} \cdot \frac{|CC'|}{|CA|} \right). \]

The two factors inside each pair of parentheses add up to 1, hence their product is at most
1/4, with equality if and only if the respective terms are equal. Therefore the whole product
is at most $(1/4)^3 = (1/8)^2$, with equality if and only if B′, C′, and P are midpoints of AB,
AC, and B′C′, respectively.
Since the product of [BPB′]/[ABC] and [CPC′]/[ABC] is always at most $(1/8)^2$, the
smaller of the two is at most 1/8, with equality if and only if B′, C′, and P are midpoints
of AB, AC, and B′C′, respectively. Hence the required maximum is 1/8.
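
As a sanity check on the value 1/8, formulas (1) and (2) can be evaluated on a grid. This is a sketch under stated assumptions: the parameters u = |AB′|/|AB|, v = |AC′|/|AC|, and w = |B′P|/|B′C′| are names introduced here for illustration, and a grid search only samples the configuration space rather than proving anything.

```python
from fractions import Fraction

def ratios(u, v, w):
    """Area ratios [BPB']/[ABC] and [CPC']/[ABC] from formulas (1) and (2).

    u = |AB'|/|AB|, v = |AC'|/|AC|, w = |B'P|/|B'C'| (illustrative parameters).
    """
    return w * (1 - u) * v, (1 - w) * (1 - v) * u

steps = 20
grid = [Fraction(i, steps) for i in range(steps + 1)]

# Largest value of the smaller ratio over the grid, with the point attaining it.
best = max(
    (min(ratios(u, v, w)), u, v, w)
    for u in grid for v in grid for w in grid
)
print(best)
```

Exact rational arithmetic avoids any rounding question at the midpoint configuration u = v = w = 1/2, where both ratios equal 1/8.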


Solved also by J. Anglesio (France), R. Barbara (Lebanon), K. L. Bernstein, R. J. Chapman (U. K.), J.-P. Grivaux 
(France), N. Komanda, J. H. Lindsey II, O. P. Lossers (The Netherlands), C. G. Petalas (Greece), R. Reynolds, D. Tang, 
A. N. ’t Woord (The Netherlands), NSA Problems Group, and the proposer. 


A Recurrence with a Harmonic Solution 


10375 [1994, 362]. Proposed by John Brillhart and J. S. Lomont, University of Arizona, 
Tucson, AZ. Find the complete solution of the recurrence 


\[ U_{n+2} = 2(2n+3)^2 U_{n+1} - 4(n+1)^2(2n+1)(2n+3)U_n \qquad (n \ge 0). \]

Solution by Kee-Wai Lau, Hong Kong. We show that for n ≥ 1,
\[ U_n = (2n)!\,\bigl(U_0 + (U_1/2 - U_0)(1 + 1/2 + 1/3 + \cdots + 1/n)\bigr). \]
The substitution $V_n = U_n/(2n)!$ yields
\[ (n+2)V_{n+2} = (2n+3)V_{n+1} - (n+1)V_n \]
for n ≥ 0. This is equivalent to $(n+2)(V_{n+2} - V_{n+1}) = (n+1)(V_{n+1} - V_n)$, which implies
that $(n+2)(V_{n+2} - V_{n+1}) = V_1 - V_0$. Rewriting this as $V_{n+2} = V_{n+1} + (V_1 - V_0)/(n+2)$
yields the solution
\[ V_n = V_0 + (V_1 - V_0)(1 + 1/2 + \cdots + 1/n), \]
which translates into the desired formula.
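
The closed form can be compared with the recurrence term by term. The sketch below uses exact rational arithmetic; the function names and the starting values 3 and 10 are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import factorial

def by_recurrence(u0, u1, count):
    """First `count` terms of U_{n+2} = 2(2n+3)^2 U_{n+1} - 4(n+1)^2 (2n+1)(2n+3) U_n."""
    terms = [Fraction(u0), Fraction(u1)]
    for n in range(count - 2):
        terms.append(2 * (2 * n + 3) ** 2 * terms[n + 1]
                     - 4 * (n + 1) ** 2 * (2 * n + 1) * (2 * n + 3) * terms[n])
    return terms[:count]

def closed_form(u0, u1, n):
    """U_n = (2n)! (U_0 + (U_1/2 - U_0) H_n), with H_n the n-th harmonic number."""
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return factorial(2 * n) * (u0 + (Fraction(u1, 2) - u0) * harmonic)

terms = by_recurrence(3, 10, 12)
agree = all(terms[n] == closed_form(3, 10, n) for n in range(12))
print(agree)
```

With $U_0 = 1$ and $U_1 = 2$ the harmonic part vanishes and the sequence reduces to $U_n = (2n)!$, the solution several solvers started from.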


Editorial comment. Several solvers began by observing that $U_n = (2n)!$ is a solution. The
second solution can then be found by reduction of order. In particular, Marko Petkovšek
claimed to find that solution “by inspection” although he indicated in a footnote to his
solution that the inspection was done by a computer. See M. Petkovšek, Hypergeometric
solutions of linear recurrences with polynomial coefficients, J. Symb. Comp. 14 (1992),
243-264 for more information about this approach.

Solved also by J. Alvarez (Spain), J. Anglesio (France), R. Bagby, H. J Barten, K. L. Bernstein, J. C. Binz (Switzerland, 
two solutions), G. A. Bookhout, M. Burger (Austria), R. J. Chapman (U. K.), C. K. Cook, P. Deiermann, D. Doster, 
S. B. Ekhad, C. Georghiou (Greece), J.-P. Grivaux (France), C. R. Hampton, M. S. Klamkin (Canada), A. M. Krall, 
J. Laforgue, K.-W. Lau (Hong Kong), G. Letac (France), J. H. van Lint (The Netherlands), S. C. Locke, O. P. Lossers 
(The Netherlands), G. Loudner, C. Mallinger (Austria), L. E. Mattics, J. Ottenstein (Israel), A. Pedersen (Denmark), 
M. Petkovšek (Slovenia), R. C. Read (Canada), R. Richberg (Germany), N. C. Singer, M. Vowe (Switzerland), H. Widmer
(Switzerland), Anchorage Math Solutions Group, NSA Problems Group, and the proposers. 


A Diophantine Polynomial Equation 


10376 [1994, 362]. Proposed by Nobuhisa Abe, Oita, Japan. Determine all integer solutions 
of 
\[ x(x+1)(x+2)(x+3)(x+4)(x+5) = y^2 - 1. \]


Solution by Paul T. Bateman, University of Illinois, Urbana, IL. Aside from the trivial
solutions given by x = −5, −4, −3, −2, −1, 0 and y = ±1, we claim that the only integer
solutions of the given equation are (x, y) = (−7, ±71) or (2, ±71).

Let $f(x) = x(x+1)(x+2)(x+3)(x+4)(x+5) + 1$. We seek those integer values of
x such that f(x) is a square or, equivalently, 64f(x) is a square. Since f(−x) = f(x − 5),
we may restrict attention to positive integer values of x. For large positive x we have the
following expansion in decreasing powers of x:

\begin{align*}
(64 f(x))^{1/2} &= 8x^3\bigl((1+x^{-1})(1+2x^{-1})(1+3x^{-1})(1+4x^{-1})(1+5x^{-1}) + x^{-6}\bigr)^{1/2} \\
&= 8x^3\bigl(1 + 15x^{-1} + 85x^{-2} + 225x^{-3} + O(x^{-4})\bigr)^{1/2} \\
&= 8x^3 + 60x^2 + 115x + 75/2 + O(x^{-1}).
\end{align*}
Hence for large positive x we have

\[ (8x^3 + 60x^2 + 115x + 37)^2 < 64f(x) < (8x^3 + 60x^2 + 115x + 38)^2. \tag{$*$} \]
A brief calculation gives

\[ (8x^3 + 60x^2 + 115x + 38)^2 - 64f(x) = 8x^3 + 249x^2 + 1060x + 1380 \]
and

\[ 64f(x) - (8x^3 + 60x^2 + 115x + 37)^2 = 8x^3 - 129x^2 - 830x - 1305. \]
Hence the right-hand inequality in (*) holds for all positive x and the left-hand inequality
in (*) holds for x ≥ 22. Thus, if x is any integer greater than 21, (*) shows that 64f(x)
lies strictly between two consecutive squares and so is not a square. On the other hand, it
is easy to verify (using a pocket calculator, say) that f(x) is not a square for x = 1 and for
x = 3, 4, 5, ..., 21. Thus our claim is established.
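
The finite verification mentioned above (the "pocket calculator" step) is easy to automate. The following sketch searches a symmetric window of integers; the window −200..200 is an arbitrary choice, since the argument above already excludes every x > 21.

```python
from math import isqrt

def f(x):
    """x(x+1)(x+2)(x+3)(x+4)(x+5) + 1, as in the solution."""
    return x * (x + 1) * (x + 2) * (x + 3) * (x + 4) * (x + 5) + 1

def is_square(m):
    return m >= 0 and isqrt(m) ** 2 == m

# Pairs (x, |y|) with f(x) = y^2 inside the search window.
solutions = [(x, isqrt(f(x))) for x in range(-200, 201) if is_square(f(x))]
print(solutions)
```

Using `math.isqrt` keeps the square test exact for arbitrarily large integers, which a floating-point square root would not.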


This method goes back to C. Runge, Über ganzzahlige Lösungen von Gleichungen zwischen
zwei Veränderlichen, J. Reine Angew. Math. 100 (1887), 425-435.


Editorial comment. L. E. Mattics made a systematic study of the equation with $y^2 - k$ in
place of $y^2 - 1$ on the right side, and found that the positive solutions with |k| < 31 are:
(3, 142) for k = 4, (1, 27) for k = 9, and (21, 12875) for k = 25. He proved also that for
positive solutions, $1 \le x \le \max\{27, (40|k| + 1)^{1/3}\}$.

Solved also by Y. Alemu (Ethiopia), J. Anglesio (France), D. Caccia, M. J. Cohen, H. G. Killingbergtrø (Norway),
N. Komanda, J. H. Lindsey II, S. C. Locke & A. D. Meyerowitz, O. P. Lossers (The Netherlands), L. E. Mattics, M. Reid,
J. P. Robertson, R. M. Robinson, F. Schmidt, N. C. Singer, A. N. ’t Woord (The Netherlands), Anchorage Math Solutions
Group, and the proposer. 


Hermitian Matrices 


10377 [1994, 362]. Proposed by Kathryn R. Laberteaux (student), University of Michigan, 
Ann Arbor, MI. On the final exam in a linear algebra class, I was asked to express the 
statement “A is Hermitian” in the form of a matrix identity. I should have written “A = A*”, 
but out of haste and exhaustion I wrote “$AA^* = A^2$” instead. Was my answer correct?


Solution I by Marvin Marcus, Santa Barbara, CA. Yes. By the Schur Triangularization
Theorem, we may write $A = U^*TU$, where U is unitary and T is upper triangular with
diagonal $\lambda_1, \ldots, \lambda_n$ being the eigenvalues of A. Then $AA^* = A^2$ becomes $TT^* = T^2$.
Now $\operatorname{tr}(TT^*) = \operatorname{tr}(T^2)$ yields $\sum_{k=1}^n |\lambda_k|^2 + \sum_{1 \le i < j \le n} |t_{ij}|^2 = \sum_{k=1}^n \lambda_k^2$. The triangle
inequality implies that $t_{ij} = 0$ for all i < j. We then have $|\lambda_k|^2 = \lambda_k^2$ for all k, and hence
each $\lambda_k$ is real. In other words, A is unitarily similar to a diagonal matrix with real entries
and thus is Hermitian.


Solution II by G. P. Shannon, University of Ulster, Coleraine, N. Ireland. To show that
$A - A^* = 0$, it suffices to show that $\operatorname{tr}((A - A^*)(A - A^*)^*) = 0$. We first compute
$(A - A^*)(A - A^*)^* = (A - A^*)(A^* - A) = (AA^* - A^2) + (A^*A - A^{*2})$. Using the linearity of
the trace function, the property $\operatorname{tr}(AA^*) = \operatorname{tr}(A^*A)$, and the condition $AA^* = A^2 = (A^*)^2$,
we obtain $\operatorname{tr}((A - A^*)(A - A^*)^*) = \operatorname{tr}(A^*A) - \operatorname{tr}(A^{*2}) + \operatorname{tr}(AA^*) - \operatorname{tr}(A^2) = 0$, and so
A is Hermitian.


Solution III by Eugene A. Herman, Grinnell College, Grinnell, IA. Yes, the equations $A^* = A$
and $AA^* = AA$ are equivalent. Multiplying the first by A yields the second. To derive the
first from the second, we use that the null space of A and the range of $A^*$ are orthogonal
complements with respect to the complex inner product. Thus $A^* = A$ follows by proving
that $A^*v = Av$ for all $v \in N(A)$ and all $v \in R(A^*)$. If $v \in N(A)$, then Av = 0, and
we compute $\|A^*v\|^2 = (A^*v, A^*v) = (v, AA^*v) = (v, AAv) = (v, 0) = 0$. This yields
$A^*v = 0$, and hence $A^*v = Av$. When $v = A^*w \in R(A^*)$, we use the conjugate transpose
$AA^* = A^*A^*$ of $AA^* = AA$ to obtain $A^*v = A^*A^*w = AA^*w = Av$, which completes
the proof.


Solution IV by Robert B. Israel, University of British Columbia, Vancouver B. C., Canada.
Assume that $AA^* = A^2$. We first prove by induction that $A^n = (A^*)^n$ for n ≥ 2.
Conjugating the given equation yields this for n = 2. If $A^n = (A^*)^n$ for some n ≥ 2, then
\[ A^{n+1} = A(A^*)^n = (AA^*)(A^*)^{n-1} = (A^*)^2(A^*)^{n-1} = (A^*)^{n+1}. \]

Let p be the monic minimal polynomial of A. Every eigenvalue of $A^2 = AA^*$ is a
nonnegative real number, so every eigenvalue of A is real. Therefore p has real coefficients
and $p(A^*) = (p(A))^* = 0$. Thus p is also the minimal polynomial of $A^*$.

If $(A^*)^2v = 0$, that is $AA^*v = 0$, then $A^*v = 0$. Thus t occurs at most once as a
factor of p(t), and the linear coefficient $a_1$ or constant coefficient $a_0$ in p(t) is nonzero. If
$a_0 = 0$, then $0 = p(A) - p(A^*) = a_1(A - A^*)$ with $a_1 \ne 0$, so $A = A^*$. If $a_0 \ne 0$, then
$0 = Ap(A) - A^*p(A^*) = a_0(A - A^*)$ and again $A = A^*$.


Solution V by Gérard Letac, Université Paul Sabatier, Toulouse, France. Consider the
Hermitian matrix $B = i(A - A^*)$. The hypothesis $AA^* = A^2$ yields AB = 0. If B has a nonzero
eigenvalue λ with eigenvector x, then 0 = ABx = λAx. Thus $0 = Ax = A^*x - i\lambda x$, and
$0 = x^*Ax = x^*A^*x - i\lambda x^*x$. Since $x^*Ax = x^*A^*x$, we obtain λ = 0, a contradiction.
Since the eigenvalues of the Hermitian matrix B are zero, B = 0.

Solved also by A. Bandopadhyay (India), S. K. Berberian, R. Bielawski (Canada), P. Budney, D. Callan, R. J. Chapman
(U. K.), M.-D. Choi (Canada) & C.-K. Li & W. So, D. Choudhury, L. M. DeAlba, G. Ehrlich, D. Fasino (Italy), P. M.
Gibson, S. Goldberg, J.-P. Grivaux (France), J. L. Hartman, A. L. Holshouser & B. G. Klein, R. A. Horn, N. Komanda,
D. W. Koster, K. Kubo (Japan), C. Lanski, J. H. Lindsey II, O. P. Lossers (The Netherlands), W. Margulies, T. L.
Markham & A. R. Schep, J. K. Merikoski & A. Virtanen (Finland), J. M. Monier (France), J. Ratz (Switzerland),
R. Richberg (Germany), F. Richman, F. Schmidt, A. Singh (India), G. Trenkler (Germany), E. I. Verriest (France), A. N.
’t Woord (The Netherlands), D. J. Wright, P. Y. Wu (China), F. Zhang, M. Zhang (China), the New Mexico Tech Problem
Solving Group, and the proposer.


From Integers to Integers, But Not Very Many 


10382 [1994, 473]. Proposed by Richard K. Guy, University of Calgary, Calgary, Alberta, 
Canada. Which integers are represented by $(x + y + z)^2/(xyz)$ where x, y, and z are
positive integers? 


Solution by Rolf Richberg, RWTH Aachen, Aachen, Germany. The numbers so represented 
are {1, 2, 3, 4, 5, 6, 8, 9}. 
Let $F(x, y, z) = (x + y + z)^2/(xyz)$. Fix n and suppose n = F(x, y, z), with x ≤ y ≤ z
and z minimal for that choice of n. From
\[ nxyz = (x + y + z)^2 = (x + y)^2 + 2(x + y)z + z^2, \]
we infer that $z \mid (x + y)^2$. If z > x + y, then $(x + y)^2/z < z$ and

\[ F\Bigl(x, y, \frac{(x+y)^2}{z}\Bigr) = \frac{\bigl(x + y + (x+y)^2/z\bigr)^2}{xy\,(x+y)^2/z} = \frac{(x + y + z)^2}{xyz} = n. \]

Thus the minimality of z implies that x + y ≥ z. Now

\[ n = \frac{x}{yz} + \frac{y}{xz} + \frac{z}{xy} + \frac{2}{x} + \frac{2}{y} + \frac{2}{z} \le \frac{1}{z} + \frac{1}{x} + \Bigl(\frac{1}{x} + \frac{1}{y}\Bigr) + \frac{2}{x} + \frac{2}{y} + \frac{2}{z} \le \frac{7}{x} + \frac{3}{z}. \]
This implies that z = 1 (and n = 9) or that z ≥ 2 (and n ≤ 8). Thus n ≤ 9.

We next prove that n ≠ 7. The inequality 7 ≤ 7/x + 3/z prohibits x ≥ 2. With x = 1,
x + y ≥ z yields y ≤ z ≤ y + 1. When z = y we have $(1 + 2y)^2 = 7y^2$, and when
z = y + 1 we have $(2 + 2y)^2 = 7y(y + 1)$, neither of which has an integer solution.

Finally, F(9, 9, 9) = 1, F(4, 4, 8) = 2, F(3, 3, 3) = 3, F(2, 2, 4) = 4, F(1, 4, 5) = 5,
F(1, 2, 3) = 6, F(1, 1, 2) = 8, and F(1, 1, 1) = 9.
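
The list of represented integers can be reproduced by a small exhaustive search. This is a sketch only: the function name `integer_values` and the bound 30 are illustrative, and the search confirms the list without replacing the descent argument above.

```python
def integer_values(bound):
    """Map each integer value of (x+y+z)^2/(xyz) to one triple x <= y <= z <= bound."""
    values = {}
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            for z in range(y, bound + 1):
                quotient, remainder = divmod((x + y + z) ** 2, x * y * z)
                if remainder == 0:
                    values.setdefault(quotient, (x, y, z))
    return values

found = integer_values(30)
print(sorted(found))
```

The value 7 never appears, in line with the argument that n ≠ 7, and the value 1 needs all three variables to reach 9.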

Solved also by J. Anglesio (France), M. Barr (Canada), B. Battsengel (Mongolia), R. J. Chapman (U. K.), J. Christopher,
M. Farris, D. Khaykis, P. A. Kumar & V. Kannan (India), G. N. Lewis, J. H. Lindsey II, L. E. Mattics, J. McHugh, R. E.
Prather, J. P. Robertson & J. S. Robertson, R. M. Robinson, N. C. Singer, D. West, NSA Problems Group, and the proposer.


Spaces with Closed Fixed-Point Sets 


10385 [1994, 474]. Proposed by Nandor Sieben, Arizona State University, Tempe, AZ. Let 
X be a topological space. It is easy to see that if X is a Hausdorff space, then fixed-point sets 
are closed. That is, for any continuous function f: X → X, the set F = {x ∈ X : f(x) = x}
is closed. Is the converse true? More precisely, if X has the property that all fixed-point 
sets are closed, must X be a Hausdorff space? 


Solution by Paul R. Meyer, Lehman College, CUNY, Bronx, NY. Let $T_{\mathrm{fix}}$ denote the stated
property. One can readily verify that $T_2 \Rightarrow T_{\mathrm{fix}} \Rightarrow T_1$ and that the cofinite topology
on an infinite set shows that the second implication is not reversible.

To give a negative answer to the question in this problem by showing that the first
implication is not reversible, we use an example from Helen F. Cullen, Unique sequential
limits, Boll. Un. Mat. Ital. (3) 20 (1965), 123-124. Let X = R ∪ {p} where p ∉ R. A
set not containing p is open if and only if it is open in the usual topology of R, and a set
containing p is open if and only if its complement is the union of finitely many convergent
sequences in R together with their limits. Clearly X is not $T_2$ (the set [0, 1] is compact but
not closed). To show that X is $T_{\mathrm{fix}}$, let f be a continuous self-map of X; we show that its
set of fixed points F is closed. There are two cases.

Case 1. f(p) = p. To show that F is closed, assume that x is an accumulation point of
F with x ≠ p. Then x ∈ R and there is a sequence $(x_n)$ of elements of F converging to x.
Since $f(x_n) = x_n$, it follows from the continuity of f that f(x) = x. Thus x ∈ F and F
is closed.

Case 2. f(p) ≠ p. We complete the proof by showing that f is a constant function. Say
f(p) = 0 for definiteness. Then $X - f^{-1}(-1/n, 1/n)$ is a countable set, call it $A_n$. Then
$\bigcup A_n = X - f^{-1}(0)$ is also countable, so that $f^{-1}(0)$ is dense in R by the Baire category
theorem. But $f^{-1}(0)$ is also closed in X. It follows that f is constant. Thus F = {0},
which is closed.


Editorial comment. Joachim Schröder supplied a reference to Věra Trnková, Categorical
aspects are useful for topology, General Topology and its Relations to Modern Analysis and
Algebra (Lecture Notes in Mathematics 609), Springer, 1977, where it is shown that every
$T_1$ space V can be embedded as a closed subspace of a $T_1$ space X such that all continuous
self-maps of X are either constant or the identity.


Solved also by B. Burdick, J. Cobb, R. Griffus & K. Smith, A. Kumar & P. Chatterjee (India), R. P. Millspaugh, J. Schroder 
(South Africa), J. R. Smith, and the proposer. 


Accumulated Preferences 


10397 [1994, 681]. Proposed by Sam Northshield, SUNY, Plattsburgh, NY, and José 
Luis Palacios, Universidad Simon Bolivar, Caracas, Venezuela. Let X be an N-valued 
random variable. Show that if Pr(X = k | X = k or X = k + 1) is non-decreasing, then
Pr(X = k | X ≤ k) is non-increasing.


Solution by Richard Holzsager, The American University, Washington, DC. Write $p_k =
\Pr(X = k)$. The hypothesis implies that if $p_k > 0$, then $p_{k+1} > 0$. It won’t hurt to assume
that the first k for which $p_k > 0$ is k = 0 (it’s just a matter of renumbering, and saves the
trouble of interpreting Pr(X = 0 | X ≤ 0) when both events are impossible).

The hypothesis $p_{k-1}/(p_{k-1} + p_k) \le p_k/(p_k + p_{k+1})$ is equivalent to $p_{k-1}/p_k \le
p_k/p_{k+1}$. By transitivity, $p_i/p_{i+1} \le p_n/p_{n+1}$, or $p_i/p_n \le p_{i+1}/p_{n+1}$ for any i ≤ n, so

\[ \frac{p_0 + \cdots + p_n}{p_n} \le \frac{p_1 + \cdots + p_{n+1}}{p_{n+1}} < \frac{p_0 + \cdots + p_{n+1}}{p_{n+1}}. \]

Inverting, we get the desired result, strengthened to say that the probabilities in question are
actually strictly decreasing.
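
A concrete instance may help fix ideas. The sketch below assumes, purely for illustration, Poisson-type weights $p_k$ proportional to 1/k!; both conditional probabilities are ratios, so the normalizing constant and the truncation at 20 terms do not affect them.

```python
from fractions import Fraction
from math import factorial

# Unnormalized weights p_k proportional to 1/k! (an illustrative choice).
weights = [Fraction(1, factorial(k)) for k in range(20)]

# Pr(X = k | X = k or X = k + 1) for k = 0, ..., 18
pair_ratio = [weights[k] / (weights[k] + weights[k + 1]) for k in range(19)]

# Pr(X = k | X <= k) for k = 0, ..., 19
cumulative_ratio = [weights[k] / sum(weights[:k + 1]) for k in range(20)]

hypothesis = all(a <= b for a, b in zip(pair_ratio, pair_ratio[1:]))
conclusion = all(a > b for a, b in zip(cumulative_ratio, cumulative_ratio[1:]))
print(hypothesis, conclusion)
```

For these weights the first ratio works out to (k + 1)/(k + 2), which is increasing, and the second sequence is strictly decreasing, matching the strengthened conclusion.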


Solved also by A. Adler, S. V. Amari & R. B. Misra (India), C. Anderson, M. H. Andreoli, N. Bouzar, D. D. Brics 
(Denmark) & D. Ranjan, A. E. Caicedo Núñez (Colombia), D. Callan, R. J. Chapman (U. K.), M. P. Cohen, D. A.
Darling, R. Ehrenborg (Canada), N. Grivaux (France), V. Hernandez (Spain), E. Hertz, R. D. Hurwitz, R. S. Katti & 
S. Vidyashankara, F. Kemp, G. Keselman, J. Kupka (Australia), J. H. Lindsey II, O. P. Lossers (The Netherlands), 
J. Marengo, D. K. Nester, A. Pedersen (Denmark), G. S. Rogers, D. M. Rosenblum, R. P. Sealy (Canada), N. C. Singer, 
R. Stong, D. B. Tyler, E. I. Verriest (France), E. A. Weinstein, NSA Problems Group, Western Maryland College Problems 
Group, and the proposers. 


A Way to Form Ideals of Power Series 


10399 [1994, 682]. Proposed by Mowaffaq Hajja, Yarmouk University, Irbid, Jordan. Find 
all infinite sequences $c = (c_0, c_1, c_2, \ldots)$ of integers for which the set

\[ I_c = \Bigl\{ \sum a_i x^i \in \mathbb{Z}[x] : \sum a_i c_i = 0 \Bigr\} \]

is an ideal of Z[x].


Composite solution by John H. Lindsey II, Ft. Myers, FL and Nasha Komanda, Central
Michigan University, Mt. Pleasant, MI. We prove that $I_c$ is an ideal if and only if there are
integers r and s such that $c_i = r^i s$ for all i.

Necessity. If $c_0 = 0$, then $1 \in I_c$, which requires $1 \cdot x^i \in I_c$, and thus $c_i = 0$ for all
i. We choose s = 0 and r arbitrary. On the other hand, if $c_0 \ne 0$, then $c_1 - c_0x \in I_c$,
which requires $(c_1 - c_0x)x^i \in I_c$. We conclude that $c_1c_i - c_{i+1}c_0 = 0$, and therefore
$c_{i+1} = rc_i$, where $r = c_1/c_0$. This yields $c_i = r^ic_0$ for i ≥ 1. Now r must be an integer,
since otherwise $c_i$ will not be an integer when i is sufficiently large. We choose $s = c_0$.

Sufficiency. Suppose that $c_i = r^is$ for fixed integers r and s. If $c_0 = 0$, then $I_c = \mathbb{Z}[x]$.
If $c_0 \ne 0$, then $f \in I_c$ if and only if f(r) = 0. Thus $I_c$ is the set of all polynomials in Z[x]
that vanish at r; this set is an ideal of Z[x].

Solved also by R. Barbara (Lebanon), P. Budney, L. Cagliero & J. Lauret (Argentina), A. E. Caicedo Núñez (Colombia),
R. M. Carroll, R. J. Chapman (U. K.), R. Ehrenborg (Canada), D. Faurot, S. M. Gagola Jr., R. Holzsager, U. Klein (Germany),
O. P. Lossers (The Netherlands), P. J. Morandi, I. Nemes (Austria), A. Nijenhuis, A. Pedersen (Denmark),
R. Stong, A. N. ’t Woord (The Netherlands), and the proposer.


A Sequence of Reducible Polynomials 


10423 [1993, 1014]. Proposed by M. Filaseta & C. Nicol, University of South Carolina, 
Columbia, SC. For a positive integer n, let

\[ P_n(x) = \sum \{ x^{j-1} : 1 \le j \le n,\ \gcd(j, n) = 1 \}. \]

For example, $P_1(x) = P_2(x) = 1$, $P_3(x) = x + 1$, $P_4(x) = x^2 + 1$, $P_5(x) = x^3 + x^2 + x + 1$,
and $P_6(x) = x^4 + 1$. Prove that $P_n(x)$ is reducible over the rationals for every n ≥ 7.


Composite solution by Roy Barbara, Lebanese University, Fanar, Lebanon, and Michael
Reid, Brown University, Providence, RI. We prove inductively that for n > 2, $P_n(x)$ has a
factor of the form $1 + x^r$ for some r > 0, and it is a nontrivial factor if n > 6. If n is prime,
then $P_n(x) = \sum_{i=0}^{n-2} x^i = (1 + x)(1 + x^2 + x^4 + \cdots + x^{n-3})$, and the factors are nontrivial
if n > 3. The case n = 4 can be checked explicitly.

When n > 4 is not prime, we write n = mp, where p is prime and m > 2. The induction
hypothesis yields a factor $1 + x^r$ of $P_m(x)$. Let $A_n = \{j \in \mathbb{Z} : 1 \le j \le n,\ \gcd(j, n) = 1\}$.

If p divides m, then $A_n$ is the disjoint union of translates of $A_m$ by multiples of m, and
$P_n(x) = \bigl(\sum_{i=0}^{p-1} x^{im}\bigr) P_m(x)$. Thus $1 + x^r$ is the desired factor of $P_n(x)$.

If p does not divide m, then within the universe of positive integers less than n,

\[ A_n = \{j : \gcd(j, m) = 1\} - \{j : \gcd(j, mp) = p\}. \]

This consists of translates of $A_m$, less omissions of the form j = kp with gcd(k, m) = 1.
Since $x^{kp-1} = x^{p-1}(x^p)^{k-1}$, we obtain

\[ P_n(x) = \Bigl(\sum_{i=0}^{p-1} x^{im}\Bigr) P_m(x) - x^{p-1} P_m(x^p). \tag{$*$} \]

If p is odd, then $1 + x^r$ is a factor of $1 + x^{pr}$. Since $1 + (x^p)^r$ divides $P_m(x^p)$, equation (*)
yields $1 + x^r$ as the desired factor of $P_n(x)$. If n is square-free and has no such odd divisor
p, then n = 2m, where m is an odd prime. Now explicitly $P_n(x) = \bigl(\sum_{i=0}^{m-1} x^{2i}\bigr) - x^{m-1} =
(1 + x^{m+1})(1 + x^2 + x^4 + \cdots + x^{m-3})$. The factors are nontrivial if m > 3.
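
The claimed factors can be located numerically for small n. The following is a sketch with illustrative function names: divisibility of $P_n(x)$ by $1 + x^r$ is tested by reducing each monomial modulo $x^r = -1$, which is valid because $1 + x^r$ is monic.

```python
from math import gcd

def exponents(n):
    """Exponents j - 1 of the monomials of P_n(x)."""
    return [j - 1 for j in range(1, n + 1) if gcd(j, n) == 1]

def divisible_by_one_plus_x_to_r(n, r):
    """Check (1 + x^r) | P_n(x) by reducing modulo x^r = -1."""
    remainder = [0] * r
    for e in exponents(n):
        q, s = divmod(e, r)          # x^e = (x^r)^q * x^s, and x^r acts as -1
        remainder[s] += (-1) ** q
    return not any(remainder)

def nontrivial_r(n):
    """All r with 0 < r < deg P_n such that 1 + x^r divides P_n(x)."""
    degree = max(exponents(n))
    return [r for r in range(1, degree) if divisible_by_one_plus_x_to_r(n, r)]

for n in range(6, 13):
    print(n, nontrivial_r(n))
```

For n = 6 the list is empty, as it must be since $P_6(x) = x^4 + 1$ is irreducible over the rationals; from n = 7 onward every n tested has at least one nontrivial factor of this form.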


Solved also by M. Benedicty, D. Callan, R. J. Chapman (U. K.), N. Komanda, O. P. Lossers (The Netherlands), and 
the proposers. 
Solutions to a Determinantal Equation


10427 [1995, 71]. Proposed by George Soules, CCR-IDA, Princeton, NJ. Let A be an 
n-by-n positive semi-definite Hermitian matrix. Write A = L + D+ L*, where L is lower 
triangular with zero diagonal, and D is the diagonal of A (and L* is the complex conjugate 
transpose of L). If det(D) ≠ 0, show that all n roots of det(zL + zD + L*) = 0 lie in the
unit disk |z| ≤ 1. Also, determine when this polynomial can have a root with |z| = 1.


Solution by Edward A. Bender, University of California, La Jolla, CA. Let $B = zL + zD + L^*$
and $B^* = L + \bar{z}D + \bar{z}L^*$. Up to a scalar, there is only one linear combination $\alpha B + \beta B^*$
for which the coefficients of L and $L^*$ are equal; the result is

\[ (1 - \bar{z})B + (1 - z)B^* = (1 - z\bar{z})A - |1 - z|^2 D. \tag{$*$} \]
Given det(B) = 0, choose v ≠ 0 such that Bv = 0. Since $v^*B^* = 0$, the quadratic forms
$v^*Bv$ and $v^*B^*v$ are both zero. From equation (*), we obtain

\[ (1 - z\bar{z})\,v^*Av = |1 - z|^2\, v^*Dv. \]

Since $a_{ii} > 0$ for every i, the right side is positive for z ≠ 1. Since $x^*Ax$ is a nonnegative
quadratic form, we conclude that $v^*Av > 0$ and $z\bar{z} < 1$. If z = 1, then A is singular.

Solved also by O. Krafft & M. Schaefer (Germany), and J. H. Lindsey II.


More Intertwined Exponentials 


10475 [1995, 746]. Proposed by Wu Wei Chao, He Nan Normal University, Xin Xiang City, 
He Nan Province, China. For 0 < x < y < 1 or 1 < x < y, prove that

\[ y^{x^y}/x^{y^x} > y/x \tag{1} \]

and

\[ y/x > y^x/x^y. \tag{2} \]


Solution by J. S. Frame, Michigan State University, East Lansing, MI. The assumptions on
x and y may be written as 0 < x < y and (x − 1)(y − 1) > 0.
Taking logarithms, we replace the desired inequalities by

\[ x^y \ln y - y^x \ln x > \ln y - \ln x, \qquad (1)^* \]
\[ \ln y - \ln x > x \ln y - y \ln x. \qquad (2)^* \]
Since (x − 1)(y − 1) > 0, inequalities (2) and (2)* are equivalent to

\[ \frac{\ln x}{x - 1} > \frac{\ln y}{y - 1}. \qquad (2)^{**} \]

We prove (2)** by noting that, for z > 0 and z ≠ 1, the positive-valued function f(z) =
(ln z)/(z − 1) has the negative-valued derivative
\[ \frac{\ln\bigl(1 - (1 - z^{-1})\bigr) + (1 - z^{-1})}{(z - 1)^2}, \]
so that f(x) > f(y). This proves (2)**, and hence also (2).


We now introduce a variable t with 0 < t < 1. From (2), we get $(y/x)^{1/t} > y/x >
y^x/x^y$ and $y/x > y^{xt}/x^{yt}$, so that

\[ 0 < \int_0^1 \bigl( y\,x^{yt} - x\,y^{xt} \bigr)\,dt = \frac{x^y - 1}{\ln x} - \frac{y^x - 1}{\ln y}, \]

which is equivalent to inequalities (1)* and (1) since ln x · ln y > 0.
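
The logarithmic forms (1)* and (2)* are convenient for a quick numerical spot check. The sketch below tests a handful of sample points chosen here for illustration (well separated, so floating-point rounding is not an issue); it illustrates the inequalities and is not a substitute for the proof.

```python
from math import log

def check(x, y):
    """Return the truth values of (1)* and (2)* at the point (x, y)."""
    lx, ly = log(x), log(y)
    first = x ** y * ly - y ** x * lx > ly - lx    # inequality (1)*
    second = ly - lx > x * ly - y * lx             # inequality (2)*
    return first, second

below_one = [0.1, 0.3, 0.5, 0.7, 0.9]   # sample points in (0, 1)
above_one = [1.5, 2.0, 3.0, 5.0, 8.0]   # sample points in (1, infinity)

pairs = [(x, y) for xs in (below_one, above_one) for x in xs for y in xs if x < y]
results = [check(x, y) for x, y in pairs]
print(all(first and second for first, second in results))
```

A pair such as (x, y) = (0.5, 2) is deliberately excluded, since it violates the hypothesis (x − 1)(y − 1) > 0.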


Editorial comment. Joe Howard demonstrated how to modify the published solution of 
MONTHLY Problem E 3291 [1988, 872; 1990, 346] to obtain the slightly stronger inequality 
(1) of the present problem. 


Solved also by R. J. Chapman (U. K.), J. Howard, W. Janous (Austria), P. McCartney, L. Scribani (South Africa), H.-J. 
Seiffert (Germany), F. Qi (China), and the proposer. 
REVIEWS


Edited by Underwood Dudley 
Mathematics Department, De Pauw University, Greencastle, IN 46135 


Emblems of Mind: The Inner Life of Music and Mathematics. By Edward 
Rothstein. Times Books, 1995, xx + 263, $25.00 


Reviewed by Jeffrey Nunemacher 


That mathematics and classical music share many fundamental attributes is no 
surprise to anyone acquainted with both fields. Both are highly structural; both can 
be abstract and unworldly; both depend more on natural talent than on human 
experience, so both engender prodigies. But is there a connection between mathe- 
matics and music that lies deeper than these surface similarities? Just such an 
exploration is what Edward Rothstein has attempted in his new book Emblems of 
Mind. The result is a speculative rumination on both fields, with illuminating 
extended examples chosen to illustrate his points. By design, the book is philosoph- 
ical and metaphorical in nature, so a reader needs to be responsive to poetry, 
philosophy, and “high art” to be in tune with the author’s goal. The title, for 
example, is drawn from the description of a quest for unifying understanding in 
Wordsworth’s poem “The Prelude”, and a romantic, somewhat mystical impulse
pervades the book. 

At the time of publication of the book, Mr. Rothstein was chief music critic for 
The New York Times; he now writes generally about cultural issues for that 
distinguished newspaper. Mr. Rothstein holds a Ph.D. from the Committee on 
Social Thought at the University of Chicago, and this book, which attempts to 
relate his two enthusiasms, derives (at least in spirit) from his study there. He has 
studied mathematics and music at Yale, Brandeis, and Columbia and has a good 
understanding of the nature of both fields. On the other hand, he has never been a 
professional mathematician, so his views of this subject, though they are well- 
informed and rewarding to ponder, differ somewhat in substance and definitely in 
tone from those held by many mathematicians. In particular, the author’s approach 
is heavily philosophical, and I suspect that many mathematical readers maintain an 
impatience and skepticism towards the diffuse goals and perceived “woolliness” of 
philosophy. 

The book is organized into six chapters. The first and the last (entitled 
“Prelude” and “Chorale”) introduce and recapitulate his main themes. The second
(“Partita”) explores topics in mathematics with a view to musical analogies; the 
third (“Sonata”) does the same for music with mathematical analogies. The fourth 
(“Theme and Variations”) and fifth (“Fugue”) address the issues of beauty and
truth in both areas. Though the author claims that he writes for a literate general 
reader, the book is most likely to appeal to someone who has had some serious 
involvement with both fields. I think it is unlikely that a reader who has never 
studied music theory would find much of what is said about music very compelling. 
There is also a progression of ideas in the book from the concrete and definite to 
the abstract and mystical. Mr. Rothstein finds the real essence and inner life of
both mathematics and music to be the reflection of Platonic “Forms”, which he 
discusses in the final chapter. 

The mathematics treated in the book will generally not be new to a mathemati- 
cally trained reader. There is some discussion of the axiomatic method and its 
limitations, of some problems in elementary number theory, of topology and the 
problem of homeomorphism, and of both the standard and nonstandard models of 
the continuum. These expositions are nicely done and the general reader can catch 
the flavor of the subject. The author does not make the mistake of inferring from 
the formality of most mathematical exposition that what mathematicians really do 
is to play with axiom systems. It is true, however, that the author’s interests are in 
pure mathematics, which does bias his point of view. He views mathematics as a 
humanity rather than a science; thus the interesting issues are of style, taste, and 
history rather than applicability. I wonder, for instance, whether a physicist or 
applied mathematician would agree with his statement that “truth in mathematics
is often quite different from truth in physics”. There are nice insights about issues 
not usually discussed by mathematicians. For example, in a discussion of the 
Chinese Remainder Theorem, drawn from Davis and Hersh [1], Mr. Rothstein 
gives four quite different formulations appropriate to different historical eras and 
to varying levels of abstraction. He then analyzes the issue of style in mathematics 
and its similarity to musical styles. 

The analysis of beauty in Chapter 4 is devoted mostly to making the case for the 
significance of beauty within mathematics, since most readers will grant that
beauty is a driving force behind music. Mr. Rothstein finds beauty in a variety of 
guises and purposes: the esthetics of discovery (Poincaré), of written exposition 
(Maxwell), of oral exposition (his teachers at Yale, Robinson and Kakutani), of
mathematical argument (Péter and Gauss). This range of examples is fruitful to 
consider and sheds new light on the issue. But when the author moves from 
specific instances to general assertions about his topic, one finds poetic assertions 
such as “The beauty of the true is an inescapable aspect of the inner life of 
mathematics and music” (p. 146). Such statements verge on mysticism rather than 
useful substantive analysis. It is hard to find a definition of truth that fits both 
mathematics and music and is not so vague as to be meaningless. 

There are a few errors in the book, one of which is surprising given the author’s 
general level of mathematical competence. On page 71 the epsilon-delta definition 
of continuity is stated incorrectly (with the most common error that an undergrad- 
uate analysis student would make). On page 75 the impression is given that there 
are two nondegenerate classes of loops on a torus rather than the infinite number 
that lie in the fundamental group. Overall, though, a mathematician will be happy 
with the accuracy and tone of the writing about mathematics. 
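
For orientation, the two mathematical facts at issue can be written out as follows. This is a standard textbook sketch, not a quotation from Mr. Rothstein's book; the symbols f, a, and T^2 are notation assumed here.

```latex
% Continuity of f at a point a (standard epsilon-delta definition;
% the names f and a are assumed for illustration):
\[
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x :\quad
|x - a| < \delta \;\Longrightarrow\; |f(x) - f(a)| < \varepsilon .
\]
% Fundamental group of the torus T^2 = S^1 \times S^1:
\[
\pi_1(T^2) \;\cong\; \mathbb{Z} \times \mathbb{Z} ,
\]
% so loop classes are indexed by pairs (m, n) of winding numbers,
% and there are infinitely many of them.
```

A frequent student slip is to reverse the order of the quantifiers on epsilon and delta; the review does not say whether that is the error on page 71.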

The book was successful in getting this reader back to his piano to study Bach’s 
Fugue in D♯ Minor from the Well-Tempered Clavichord, Book 1, which is much
discussed in Chapter 3 as an example of melody and counterpoint. Actually, it took
some effort to locate this piece. Eventually I found the composition as Fugue No.
VIII in the Schirmer (Czerny) edition but with a key signature of E♭ Minor. What
can explain the change of key from D♯? In the authoritative Bach-Gesellschaft
edition, the associated prelude is notated in E♭ with the fugue in D♯: one
presumes that later editions have amended the notation for consistency. The 
author’s comments about the fugue are much more illuminating in front of a piano. 
Another composition that arises in the text and is considerably easier to play for 
readers who have studied some piano is Chopin’s Prelude in A Minor. This piece is 
short enough to be reproduced on one page of Mr. Rothstein’s book and is
explored as an example of disturbing and dissonant beauty. 

How successful is Mr. Rothstein’s attempt to find deep common ground be- 
tween mathematics and music? He looks in promising places, such as the unifying 
concepts of mapping and transformations, which occur in both fields, but he does 
not find much apart from analogy. Ultimately the author locates the object of his 
quest in Platonic philosophy, which is sufficiently indefinite to be unsatisfying. I 
did not become as absorbed in this book as I had expected. While I found 
individual sections of the book insightful and respect the author’s attempt to 
articulate the common underlying principles in the two fields, I am not persuaded 
that there is much here beyond metaphor. For catching the spirit of mathematics 
and examining it from a humanistic perspective, the book by Davis and Hersh is 
more successful. Perhaps readers who are less suspicious of metaphor as an 
analytic device will think differently. I urge you to sample this book for yourself. 


REFERENCE 


1. Philip Davis and Reuben Hersh. The Mathematical Experience. Birkhäuser, Boston, 1981.


Department of Mathematical Sciences 
Ohio Wesleyan University 

Delaware, Ohio 43015 
jlnunema@cc.owu.edu 


A Tour of the Calculus. By David Berlinski. Pantheon Books, 1995, xvii + 332, 
$27.50 


Reviewed by Israel Kleiner 


Yet another calculus book? Yes and no. 

If you are looking for a book in which the formulas, equations, theorems, and 
proofs take center stage, this book will add little. If you want a tour of the subject 
that offers a wider perspective, Berlinski invites you to come aboard. 

Here is the author’s nutshell view of calculus, “the massive load-bearing walls 
and buttresses of the subject”: 


The overall structure of the calculus is simple. The subject is defined by a fantastic leading idea, 
[namely that] the real world may be understood in terms of the real numbers; one basic axiom, 
[which] brings the real numbers into existence; a calm and profound intellectual invention: the 
mathematical function; a deep property: continuity; two central definitions: instantaneous speed 
and the area underneath a curve; one ancillary definition: limit; one major theorem: the mean 
value theorem; and the fundamental theorem of the calculus. 


The twenty-six chapters of this 300-page book flesh out this skeleton in an 
engaging, witty, and mathematically honest manner. 

Berlinski’s outline provides the contours for the logical development of calculus 
rather than for its historical evolution. For example, the real numbers came last in
the historical sequence, and the notion of function was not available to the 
creators of calculus! Berlinski is aware of this: “In its historical development, the 
calculus represents an exercise in delayed gratification”, he affirms in his inimitable
way. But his book is not an historical account of calculus. 

The history of the subject is, however, given consideration in A Tour of the 
Calculus. As it must (it seems to me). For the author deals with central ideas of the 
subject, and history supplies the context and motivation for their rise and flower- 
ing. It also points to the subject as a human enterprise, standing on the shoulders 
of giants. The author’s pronouncements on some of these giants are variously 
endearing (“In his death as in his life, Descartes was unique: he was the only great 
thinker to die of discomfort”), intriguing (“[Euler] amused himself with
mathematical oddities, his intelligence functioning in that strange frictionless world which in
the history of mathematics is inhabited only by Euler himself and in the history of
music only by Mozart”), and insightful (“[Riemann] was in his temperament
a geometer, in his affiliations a Platonist, and in his soul a visionary”).

Practically all of the prominent mathematicians of Europe around 1650 could 
solve many of the problems in which elementary calculus is now used. Why, then, 
is the subject considered to have been invented (by Newton and Leibniz) in the last 
third of that century? This question helps to focus on some of the central ideas of 
calculus, for example the recognition that tangents and velocities on the one hand, 
and areas and arc-lengths on the other, can be subsumed under general concepts: 
the derivative and the integral, respectively. Following the subject’s invention 
(discovery?), it took another two centuries to provide it with rigorous foundations. 
“Certain intellectual tools may be successfully used before they are successfully 
understood” is how the author puts it. Cause for reflection. 

Among other ideas that Berlinski reflects on are the roles of arithmetic versus 
geometry, the nature of √2, the purpose of symbols, and the significance of the
notions of function, continuity, derivative, and integral. He lays special emphasis 
on continuity—of the line and of a function: 


Continuity is an aspect of things as rooted in reality as the fact that material objects occupy 
space; it is the contrast between the continuous and the discrete that is the great generating 
engine by which the real numbers are constructed and the calculus created. The concept of 
continuity is, like so many profound concepts, both simple and elusive, elementary and divinely
enigmatic. 


In the 17th century Leibniz enunciated an all-embracing Principle of Continuity, 
which, in a mathematical context, said roughly that what holds in a given case also 
holds in what appear to be like cases. At one point Leibniz justified the rules of 
operation with infinitesimals by their resemblance to those with real numbers. The 
principle played a significant role in 18th- and 19th-century mathematics. In 
18th-century calculus the modus operandi was that what is true of the finite (e.g., 
polynomials) is true also of the infinite (e.g., power series). In early 19th-century 
algebra the principle of continuity went under the name of the Principle of 
Permanence of Equivalent Forms. It said, essentially, that the laws of operation 
with positive numbers carry over to negative numbers. Poncelet in the early 19th 
century proclaimed a principle of continuity in projective geometry. The basic idea 
in all of these was that what were viewed to be insignificant changes in input had 
no effect on output. 

The concept of continuity of a function, defined in terms of epsilons and deltas, 
came into being in the second half of the 19th century—about 200 years following 
the invention of calculus. Cauchy did define continuity in 1821, but in terms of 
infinitesimals (“an infinitely small increment of the variable always produces an
infinitely small increment of the function”). This got him into trouble: he “proved” 
(for example) that a convergent series of continuous functions is continuous. (To
be sure, some have argued that it is a false reading of history to view Cauchy’s 
proof as erroneous.) Mathematicians believed, and some “proved”, that continuity 
implies differentiability (except possibly at a finite number of points)—until 
Riemann and Weierstrass jolted the mathematical community with their examples 
of continuous nowhere differentiable functions. And continuity was sometimes 
identified with the Intermediate Value Property (a counterexample was given by 
Darboux in 1870). “Elusive” and “enigmatic”, indeed, that continuity. 
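
One standard counterexample to the claim about series of continuous functions is sketched below. It is supplied as an illustration under the usual modern definitions and is not taken from Berlinski's book or from Cauchy's text.

```latex
% A pointwise-convergent series of continuous functions on the real line
% whose sum is discontinuous at x = 0 (illustrative example):
\[
\sum_{n=0}^{\infty} \frac{x^{2}}{(1+x^{2})^{n}}
=
\begin{cases}
1 + x^{2}, & x \neq 0,\\[2pt]
0, & x = 0.
\end{cases}
\]
% For x \neq 0 this is a geometric series with ratio 1/(1+x^2) < 1:
%   x^2 \cdot \frac{1}{1 - 1/(1+x^2)} = x^2 \cdot \frac{1+x^2}{x^2} = 1 + x^2 .
% At x = 0 every term vanishes, so the sum is 0.
```

Each term is continuous and the series converges at every point, yet the sum jumps at x = 0. The convergence is pointwise but not uniform near 0; under uniform convergence the sum of continuous functions is continuous.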

But not just continuity. Even functionality—nowadays elementary and simple 
and commonplace—proved to be elusive and controversial, “purely an intellectual 
object...and one of the great imponderables that like certain movie stars is 
forever familiar but forever unknown”.

Recorded mathematical history goes back almost 4000 years. During the first 
3700 of these, mathematicians developed the elements of algebra, deductive 
geometry, trigonometry, analytic geometry, and calculus. Surprisingly, the notion of 
function was not part of that development. The concept originated in the early 
18th century, well into the so-called modern period in the evolution of mathemat- 
ics. Why so late? Principally because there was no manifest need for it earlier. (Is 
there a lesson here for pedagogy?) And secondarily because the algebraic prereq- 
uisites were lacking—the coming to terms with the continuum of real numbers, 
and the development of symbolic notation. 

When the function concept did emerge, it was first as a “formula” (a so-called 
“analytic expression”, although what that meant was not made precise), then as a 
curve, a formula again, an arbitrary correspondence, a formula, a set of ordered 
pairs, a generalized function... . The reception accorded the introduction of new 
functions was not always cordial. For example, Hermite called continuous, 
nowhere-differentiable functions “a lamentable evil”, and Poincaré termed them
“monsters”. Such an elementary notion, that of function, yet such a rich and 
tortuous history. 

Hilbert noted that every mathematical theory goes through three stages of 
development—the naive, the formal, and the critical. The same can be said of the 
evolution of mathematical concepts, for example the derivative: from tangent and 
velocity (naive), through differential and fluxion (formal), to limit of a difference 
quotient (critical). The psychological (hence pedagogical?) distance between the 
naive and critical stages is forbidding, according to the author: 


The derivative is an artifact, the first of the great concepts of modern science that fails 
conspicuously to correspond to anything in real life. In order to express speed as a function of 
time, the mathematician is prepared to sacrifice common sense, he is prepared to sacrifice the 
intuitive definition of speed, in plain fact, he is prepared to sacrifice everything. 


Let us set aside discussion of other issues raised in Berlinski’s book and ask: 
What is the book good for? And whom is it good for? Here is the author’s answer: 


I have written this book for men and women who wish to understand the calculus as an 
achievement in human thought. It will not make them mathematicians, but I suspect that what 
they want is simply a little more light shed on a dark subject. 


Calculus as an achievement in human thought? That as a major goal would 
seem to disqualify the book as a text for most (all?) existing calculus courses. Yes, 
calculus is a set of rules or algorithms (a “calculus”); it is also a theory to explain 
why the rules work; and it is applications (of the theory and the rules) to 
fundamental problems in science. But calculus is one of the great intellectual 
accomplishments of civilization. Ought we perhaps not begin every calculus course
with some such statement? Of course it is rather difficult to deliver on this 
promise, but Berlinski’s book might help us to start thinking about it. 

A Tour of the Calculus is written in narrative style and is easy reading (for a 
mathematician, that is). There are (in a dozen brief appendices attached to 
relevant chapters) formal definitions of major concepts, and proofs of a number of 
important results. But this is no systematic account even of elementary calculus. 
That, too, would seem to disqualify it as a text. We would all agree, however, that 
among the multitude of definitions and theorems taught in a calculus course some 
are more important than others. We ought to bear this insight in mind and pass it 
on to students. Berlinski’s book might help us focus our thoughts on the issue. 

A Tour of the Calculus is one in a genre of recently published (or reprinted) 
“popular” books. Among my favorites are E. T. Bell, Mathematics: Queen and 
Servant of Science, R. Courant and H. Robbins, What is Mathematics?, T. Dantzig, 
Number: The Language of Science, E. Kasner and J. R. Newman, Mathematics and 
the Imagination, V. M. Tikhomirov, Stories about Maxima and Minima, O. Toeplitz, 
Calculus: A Genetic Approach, and N. Y. Vilenkin, In Search of Infinity. Some of 
my criteria for selecting them are: they focus on mathematical ideas without giving 
short shrift to the mathematics. They incorporate at least some history of the 
subject with which they deal (and occasionally a touch of philosophy). They are 
well written and can be heartily recommended to students. And they make 
us—teachers—think. 

I know I am setting myself up for severe reprimand for (a) omitting many 
deserving books, and (b) including Bell. I hope I can count on the Monthly editors 
to come to my defense for transgression (a), but I’d like to offer a defense of (b). 

My colleagues in the history of mathematics especially will take me to task for 
including Bell. His books are reputed to contain historical inaccuracies (if not 
worse), and I will not deny that. It has been said unkindly (and I imagine unfairly) 
that Bell would not let the truth stand in the way of a good story. He does write 
beautifully and inspiringly (my opinion, it goes without saying). His Men (ouch) of 
Mathematics and his Development of Mathematics (groan, fellow historians) have 
inspired me (a good many years ago) to take a serious interest in the history of 
mathematics, and for that I am eternally grateful to him. With appropriate 
forewarning, I would not hesitate to recommend Bell’s books to aspiring students 
of the history of mathematics, if only to awaken their interest in the subject. 

But I must in conclusion return to Berlinski’s book. There is, in general, more 
than one way of achieving a set of objectives. I imagine that the book’s readers may 
have their own ideas about how to achieve those stated and pursued by the author 
of A Tour of the Calculus, or they may set other goals for such a book. I believe, 
however, that a reviewer’s function is to judge the book that the author has actually 
written, not one that he or she feels the author should have written. In that spirit, I 
am happy to commend A Tour of the Calculus to students and colleagues 
(especially the latter): it is for the most part pleasurable reading and it may set us 
to thinking (again) about some of the issues in the learning and teaching of 
calculus (with a bow to the various calculus-reform movements). 
Telegraphic Reviews are designed to alert readers in a timely manner to new books
appropriate to mathematics teaching and research. Special codes classify reviews by 
subject area and appropriate use: 
receive a second, more extensive review in the Monthly.


General, P, L*. The Parsimonious Universe:
Shape and Form in the Natural World. Ste- 
fan Hildebrandt, Anthony Tromba. Springer- 
Verlag, 1996, xiii + 330 pp, $32.95. [ISBN 0- 
387-97991-3] A wonderful, visually rich ex- 
ploration of the worldly consequences of opti- 
mization principles (networks, geodesics, sur- 
faces, living forms). Lavishly illustrated with 
historical, scientific, and computer-generated 
images. An expanded version of the au- 
thors’ 1984 Mathematics and Optimal Form 
(TR, August-September 1985; Extended Re- 
view, June-July 1988). LAS 


Reference, P, L*. Encyclopedia of Operations 
Research and Management Science. Eds: Saul 
I. Gass, Carl M. Harris. Kluwer Academic, 
1996, xxxx + 753 pp, $350. [ISBN 0-7923- 
9590-5] 185 expository articles plus descrip- 
tions, definitions, abbreviations, etc. Articles 
are designed to serve as initial sources of in- 
formation. Each includes some background or 
history, describes applications, and lists refer- 
ences. 


Recreational Mathematics, S(13). Challeng- 
ing Problems in Geometry. Alfred S. Posa- 
mentier, Charles T. Salkind. Dover, 1996, ix 
+ 245 pp, $7.95 (P). [ISBN 0-486-69154-3] 
Nearly 200 nonroutine problems in Euclidean 
geometry. Republication of the Second Edi- 
tion published by Dale Seymour Publications 
in 1988. 


Recreational Mathematics, S(13). Challenging
Problems in Algebra. Alfred S. Posamentier,
Charles T. Salkind. Dover, 1996, x
+ 262 pp, $8.95 (P). [ISBN 0-486-69148-9] 
More than 300 problems that require little more
than knowledge of high school algebra (and 
some ingenuity!). Republication of the Second 
Edition published by Dale Seymour Publica- 
tions in 1988. 


Education, P. Teaching Mathematics: To- 
ward a Sound Alternative. Brent Davis. Gar- 
land Pub, 1996, xxxii + 324 pp, $21.95 (P). 
[ISBN 0-8153-2298-4] A densely end-noted 
philosophical apologia for a new approach to 
mathematics education centered on a theoretical 
framework the author calls “enactivism”—the 
idea that understanding flows not from reason 
or observation but from action. This analysis 
leads to a pedagogy of active listening from 
which reason emerges. LAS 


Education, P, L. Communication in Mathe- 
matics, K-12 and Beyond. Portia C. Elliott, 
Margaret J. Kenney. 1996 Yearbook. NCTM, 
1996, x + 256 pp, $22. [ISBN 0-87353-423-9] 
28 diverse papers by 55 authors on mathemat- 
ical discourse, writing, and language all aimed 
at supporting NCTM’s “communication” stan- 
dard. Zal Usiskin’s concluding chapter summa- 
rizes much of the volume by describing mathe- 
matical language in its various possible forms: 
written, oral, pictorial, foreign, dead, nonsense, 
abstract, and native. LAS 


Education, P. Research in Collegiate Math- 
ematics Education. II. Eds: Jim Kaput, Alan 
H. Schoenfeld, Ed Dubinsky. CBMS Issues in 
Math. Educ., V. 6. AMS, 1996, xi + 217 pp, 
$37 (P). [ISBN 0-8218-0382-4] 10 research 
papers on diverse topics, and a list of questions 
for future work. 
History, P, L. The Collected Papers of Albert
Einstein, Volume 4, The Swiss Years: Writings, 
1912-1914. Transl: Anna Beck; Consultant: 
Don Howard. Princeton Univ Pr, 1996, xi + 
314 pp, $39.50 (P). [ISBN 0-691-02610-6] 


History, P. Collected Papers, Volume I. Flo- 
rentin Smarandache. Editura Tempus Romania, 
1995, 301 pp, (P). [ISBN 973-9205-02-X] 


Combinatorics, P*, L**. Handbook of Combinatorics.
Eds: R.L. Graham, M. Grötschel, L. Lovász.
MIT Pr, 1995, $300 set [0-262-07169-X].
Volume 1, cii + 1018 pp, $175, [ISBN 0-262-07170-3];
Volume 2, cii + 1177 pp, $175. [ISBN 0-262-07171-1]
A comprehensive overview
of the present state of combinatorics. Organized 
into five sections: Structures, Aspects, Methods, 
Applications, and Horizons. 


Discrete Mathematics, L. Problems and Ex- 
ercises in Discrete Mathematics. G.P. Gavrilov, 
A.A. Sapozhenko. Texts in Math. Sci., V. 14. 
Kluwer Academic, 1996, xi + 422 pp, $198. 
[ISBN 0-7923-4036-1] Problems with hints 
and solutions on topics such as Boolean algebra, 
graphs, networks, coding theory, algorithm the- 
ory, combinatorics, and logical design. Takes 
a functional approach to discrete mathematics 
(typical of the Moscow school). LC 


Discrete Mathematics, T(13: 1). Introduc- 
tory Discrete Mathematics. V.K. Balakrishnan. 
Dover, 1996, xiv + 236 pp, $9.95 (P). [ISBN 
0-486-69115-2] Republication, with correc- 
tions, of the 1991 Prentice Hall edition. Top- 
ics include basic counting principles, permu- 
tations, combinations, the inclusion-exclusion 
principle, generating functions, algorithm anal- 
ysis, and graph theory. 


Number Theory, S, P, L**. The New Book 
of Prime Number Records. Paulo Ribenboim. 
Springer-Verlag, 1996, xxiv + 541 pp, $59.95. 
[ISBN 0-387-94457-5] A delightful collec- 
tion of just about everything dealing with prime 
numbers. This edition has updated records and a
few new sections. (Second Edition, TR, Febru- 
ary 1990.) CEC 


Group Theory, P. Representations of Infinite- 
Dimensional Groups. R.S. Ismagilov. Transl. 
of Math. Mono., V. 152. AMS, 1996, x + 
197 pp, $85. [ISBN 0-8218-0418-9] 


Algebra, T**(14-16: 1, 2). Abstract Algebra:
An Introduction, Second Edition. Thomas W. 
Hungerford. Saunders College, 1997, xix + 
588 pp, $54. [ISBN 0-03-010559-5] Minor 
organizational changes. Keeps the distinctive 
organizational structure: integers, then polyno- 
mials, then groups and rings. (First Edition, 
TR, May 1990.) TH 
Algebra, T*(17-18: 1), P, L. Field and Galois
Theory. Patrick Morandi. Grad. Texts
in Math., V. 167. Springer-Verlag, 1996, xvi 
+ 281 pp, $42.50. [ISBN 0-387-94753-1] A
splendid book. Covers separability, Galois the- 
ory, applications, and infinite extensions. Large 
collection of examples and exercises. TH 


Algebra, T(18: 1), P. An Algebraic Introduc- 
tion to Complex Projective Geometry: 1. Com- 
mutative Algebra. Christian Peskine. Stud. in 
Adv. Math., V. 47. Cambridge Univ Pr, 1996, 
x + 230 pp, $39.95. [ISBN 0-521-48072-8] 
The first of a three-part introduction to commu- 
tative algebra developed as a tool for geometry 
and number theory. Topics include: Noetherian 
rings and modules, polynomial rings in several 
variables, Weil and Cartier divisors. TH 


Algebra, P. Period Spaces for p-divisible 
Groups. M. Rapoport, Th. Zink. Annals of 
Math. Stud., No. 141. Princeton Univ Pr, 1996, 
xxi + 324 pp, $59.50 (P). [ISBN 0-691-02781-1]


Complex Analysis, P. Lectures on Entire 
Functions. B. Ya. Levin. Transl. of Math. 
Mono., V. 150. AMS, 1996, xv + 248 pp, $99. 
[ISBN 0-8218-0282-8] 


Differential Equations, T(14: 1). A First 
Course in Differential Equations with Model- 
ing Applications, Sixth Edition. Dennis G. 
Zill. Brooks/Cole, 1997, xiii + 438 pp, $67.95. 
[ISBN 0-534-95574-6] Increased emphasis of 
modeling; color plates with new applications; 
more emphasis on non-linear and systems of 
differential equations. (Fourth Edition, TR, 
February 1989.) LC 


Differential Equations, P. Asymptotic Solutions
of the One-Dimensional Schrödinger
Equation. S. Yu. Slavyanov. Transl. of Math. 
Mono., V. 151. AMS, 1996, xvi + 190 pp, $99. 
[ISBN 0-8218-0563-3] 


Differential Equations, P. Linear Differen- 
tial Operators. Cornelius Lanczos. Classics in 
Appl. Math., V. 18. SIAM, 1996, xvii + 564 pp, 
$49.50 (P). [ISBN 0-89871-370-6]


Dynamical Systems, P. Introduction to the 
Qualitative Theory of Dynamical Systems on 
Surfaces. S. Kh. Aranson, G.R. Belitsky, 
E.V. Zhuzhoma. Transl. of Math. Mono., 
V. 153. AMS, 1996, xiii + 325 pp, $129. [ISBN 
0-8218-0369-7]

Functional Analysis, P. Index Theory, Coarse 
Geometry, and Topology of Manifolds. John 
Roe. CBMS Reg. Conf. Ser. in Math., No. 90. 
AMS, 1996, ix + 100 pp, $17 (P). [ISBN 0- 
8218-0413-8] 


Functional Analysis, P. C₀-Groups, Commutator
Methods and Spectral Theory of N-Body
Hamiltonians. Werner O. Amrein, Anne Boutet
de Monvel, Vladimir Georgescu. Progress in 
Math., V. 135. Birkhäuser Boston, 1996, xiv +
460 pp, $98. [ISBN 0-8176-5365-1] 


Functional Analysis, P. Partial Differential 
Equations and Functional Analysis: In Memory
of Pierre Grisvard. Eds: Jean Céa, et al.
Progress in Nonlinear Diff. Eqns. & Their Applic.,
V. 22. Birkhäuser Boston, 1996, xxii +
263 pp, $134.50. [ISBN 0-8176-3839-3] Pa- 
pers from a 1994 conference together with a 
bibliography of Grisvard’s works and one of his 
previously unpublished papers. 


Analysis, P. Trigonometric Fourier Series and 
Their Conjugates. Levan Zhizhiashvili. Math. 
& Its Applic., V. 372. Kluwer Academic, 1996, 
xii + 300 pp, $165. [ISBN 0-7923-4088-4] 


Differential Geometry, P. Finsler Geometry. 
Eds: David Bao, Shiing-shen Chern, Zhongmin 
Shen. Contemp. Math., V. 196. AMS, 1996, 
xxiii + 310 pp, $61 (P). [ISBN 0-8218-0507-X]
Proceedings of a 1995 AMS-IMS-SIAM Joint 
Summer Research Conference at the University 
of Washington. 


Differential Geometry, P. Metrics, Connec- 
tions and Gluing Theorems. Clifford Henry 
Taubes. CBMS Reg. Conf. Ser. in Math., 
No. 89. AMS, 1996, v + 90 pp, $15 (P). [ISBN 
0-8218-0323-9]

Differential Geometry, P. Riemannian Ge- 
ometry. Takashi Sakai. Transl: Takashi Sakai. 
Transl. of Math. Mono., V. 149. AMS, 1996, 
xiii + 358 pp, $119. [ISBN 0-8218-0284-4] 


Differential Geometry, P. Some Questions of 
Differential Geometry in the Large. Ed: E.V. 
Shikin. AMS Transl., Ser. 2, V. 176. AMS, 
1996, x + 192 pp, $89. [ISBN 0-8218-7506-X] 
6 papers present recent results from Russia and 
the Ukraine. 


Geometry, P, L. Strange Phenomena in Con- 
vex and Discrete Geometry. Chuanming Zong. 
Ed: James J. Dudziak. Universitext. Springer- 
Verlag, 1996, x + 158 pp, $29 (P). [ISBN 
0-387-94734-5] Famous problems in convex 
and discrete geometry that have counterintu- 
itive and/or strange answers. Contains recent 
advances and an excellent bibliography. CEC 


Algebraic Topology, T(16-17: 2). Algebraic 
Topology. C.R.F. Maunder. Dover, 1996, vii 
+ 375 pp, $10.95 (P). [ISBN 0-486-69131-4]
Republication of the 1980 Cambridge Univer- 
sity Press corrected printing. (1970 Van Nos- 
trand Reinhold edition, TR, February 1972; Ex- 
tended Review, April 1973.) 


Topology, P. Renormalization and 3-Manifolds
which Fiber over the Circle. Curtis T. McMullen.
Annals of Math. Stud., No. 142. Princeton Univ Pr,
1996, vii + 253 pp, $24.95 (P); $55.
[ISBN 0-691-01153-2; 0-691-01154-0]


Mathematical Modeling, T?(16-17: 1). El- 
ements of Pattern Theory. Ulf Grenander. 
Johns Hopkins Univ Pr, 1996, xiii + 222 pp, 
$24.95 (P); $65. [ISBN 0-8018-5188-2; 0- 
8018-5187-4] Not about pattern recognition 
algorithms. Uses algebraic and probabilistic 
methods to “formalize the ... concept of pattern 
in terms of a mathematical framework.” Some 
background in algebra, probability theory, and 
some programming experience needed. LB 


Stochastic Processes, P. Schrödinger Diffusion
Processes. Robert Aebi. Prob. & Its Applic.
Birkhäuser Boston, 1996, viii + 186 pp,
$79.50. [ISBN 0-8176-5386-4] 


Elementary Statistics, T (13-14: 1, 2). 
Statistics: Basic Principles and Applications. 
William J. Adams, Irwin Kabus, Mitchell P. 
Preiss. Kendall/Hunt, 1994, xv + 792 pp, 
$60.95, [ISBN 0-8403-8964-7]; Companion to 
Statistics: Basic Principles and Applications, 
1994, vii + 215 pp, $18.95 (P). [ISBN 0-8403- 
9414-4] Non-calculus-based. Covers basic 
descriptive and inferential statistics, with ad- 
ditional sections on index numbers, time series, 
nonparametrics. ANOVA, contingency tables, 
goodness-of-fit covered briefly in a chapter on 
additional hypothesis tests. Uses hypothetical 
examples rather than real-world data. LB 


Statistical Methods, P. Lecture Notes in Con- 
trol and Information Sciences—216: Recursive 
Nonlinear Estimation: A Geometric Approach. 
Rudolf Kulhavy. Springer-Verlag, 1996, xvi + 
224 pp, $54 (P). [ISBN 3-540-76063-6] 


Statistics, P. Mathematical Theory of Reliabil- 
ity. Richard E. Barlow, Frank Proschan. Clas- 
sics in Appl. Math., V. 17. SIAM, 1996, xv + 
258 pp, $34.50 (P). [ISBN 0-89871-369-2] 


Computer Systems, P. Java Class Reference 
Package. Randy Chapman. Specialized Sys- 
tems Consultants, 1996, $7 (P) set, [ISBN 0- 
916151-96-4]. applet, awt and util Class Ref- 
erence, 20 pp, (P); lang, io and net Class Ref- 
erence, 18 pp, (P). 


Computer Science, C, P, L. Encyclopedia of 
Graphics File Formats, Second Edition. James 
D. Murray, William vanRyper. O’Reilly & As- 
sociates, 1996, xxxvi + 1116 pp, $79.95 (P), 
with CD ROM. [ISBN 1-56592-161-5] 


Computer Science, P. Statistical Software En- 
gineering. National Academy Pr, 1996, x + 
73 pp, $29 (P). [ISBN 0-309-05344-7] A report
on opportunities for statistical thinking to
contribute to software engineering. 


Computer Science, P. Comparative Con- 
currency Semantics and Refinement of Ac- 
tions. R.J.H. van Glabbeek. CWI Tract, 
V. 109. Stichting Mathematisch Centrum, 
1996, 285 pp, Dfl. 50 (P). [ISBN 90-6196- 
454-7] 


Computer Science, P. Probabilistic Expert 
Systems. Glenn Shafer. CBMS-NSF Reg.
Conf. Ser. in Appl. Math., V. 67. SIAM, 1996, 
viii + 80 pp, $24.50 (P). [ISBN 0-89871-373-0] 


Computer Science, P. Network and Internet 
Security. Vijay Ahuja. Academic Pr, 1996, xix 
+ 324 pp, $219.95 (P). [ISBN 0-12-045595-1] 


Computer Science, P. Getting Connected: The 
Internet at 56K and Up. Kevin Dowd. O’Reilly
& Associates, 1996, xii + 410 pp, $29.95 (P). 
[ISBN 1-56592-154-2] 


Applications (Biological Science), P. Math- 
ematics and Physics of Emerging Biomedical 
Imaging. National Academy Pr, 1996, xvii + 
238 pp, $29 (P). [ISBN 0-309-05387-0] Sur- 
veys contributions of the mathematical sciences 
and physics to the current state of dynamic 
biomedical imaging and outlines research op- 
portunities. 


Applications (Biological Science), T(14—15: 
1), L. An Introduction to the Mathematics of Bi- 
ology: With Computer Algebra Models. Edward 
K. Yeargers, Ronald W. Shonkwiler, James V. 
Herod. Birkhäuser Boston, 1996, x + 417 pp,
$64.50. [ISBN 0-8176-3809-1] Mathematics 
behind problems such as aging, genetics, HIV, 
neurophysiology, etc. Biological concepts de- 
veloped as needed. Maple code interspersed. 
Rich source of examples for teachers. Prereq- 
uisite: one year of calculus, some linear algebra; 
statistics introduced in text. LC 


Applications (Economics), T(17—18: 1), P. 
Modeling and Optimization of the Lifetime of 
Technologies. Natali Hritonenko, Yuri Yat- 
senko. Appl. Optim., V. 4. Kluwer Academic, 
1996, xxxvi + 249 pp, $130. [ISBN 0-7923- 
4014-0] Application of modeling, optimiza- 
tion, systems analysis, and control theory to 
economics of replacement and renewal of tech- 
nologies in industry. LB 


Applications (Physics), P. Large-Scale Struc- 
tures in Acoustics and Electromagnetics. Na- 
tional Academy Pr, 1996, x + 252 pp, $29 (P). 
[ISBN 0-309-05337-4] Proceedings of a 1994 
symposium. Papers focus on computational 
methods for determining the dynamics of large- 
scale systems. 


Applications (Quantum Theory), P. Contemporary
Mathematical Physics: F.A. Berezin
Memorial Volume. Eds: R.L. Dobrushin, et al. 
AMS Transl. Ser. 2, V. 175. AMS, 1996, ix + 
236 pp, $99. [ISBN 0-8218-0426-X] 12 pa- 
pers in group representation theory, supermath- 
ematics, and spectral analysis. Also a brief sci- 
entific biography of Berezin and some personal 
recollections. 


Applications (Quantum Theory), P. Quan- 
tization, Nonlinear Partial Differential Equa- 
tions, and Operator Algebra. Eds: William 
Arveson, Thomas Branson, Irving Segal. Proc. 
of Symp. in Pure Math., V. 59. AMS, 1996, x 
+ 224 pp, $54. [ISBN 0-8218-0381-6] Proceedings
of the 1994 John von Neumann Symposium
at MIT.


Applications (Systems Theory), S(15—16), P, 
L*, Would-Be Worlds: How Simulation Is 
Changing the Frontiers of Science. John L. 
Casti. Wiley, 1997, xii + 242 pp, $24.95. 
[ISBN 0-471-12308-0] A preview of the
21st-century science of complex systems, which
is “still awaiting its Newton.” For lack
of a theory, computer simulation substitutes.
Such systems—physical, biological, behavioral,
social—feature a modest number of intelligent,
adaptive agents that react to local (rather
than global) information. Derived from work 
at the Santa Fe Institute; illustrated with di- 
verse models ranging from football to linguis- 
tics, from traffic flow to neural networks. LAS 


Applications (Systems Theory), P. Lecture 
Notes in Control and Information Sciences— 
215: Colloquium on Automatic Control. Eds: 
Claudio Bonivento, Giovanni Marro, Roberto 
Zanasi. Springer-Verlag, 1996, x + 226 pp, 
$54 (P). [ISBN 3-540-76060-1] Invited pa- 
pers from a 1996 event at the University of 
Bologna. 


Applications (Systems Theory), P. Lecture 
Notes in Control and Information Sciences— 
218: L2-Gain and Passivity Techniques in Non- 
linear Control. Arjan van der Schaft. Springer- 
Verlag, 1996, 168 pp, $43 (P). [ISBN 3-540- 
76074-1] 

Applications (Systems Theory), P. Sampling 
in Digital Signal Processing and Control. Arie 
Feuer, Graham C. Goodwin. Systems & Control:
Found. & Applic. Birkhäuser Boston,
1996, xxxii + 541 pp, $74.50. [ISBN 0-8176- 
3934-9] 


Reviewers 


LB: Lynne Baur, Carleton; LC: Laura Chihara, St. Olaf; 
CEC: Clifton E. Corzatt, St. Olaf; TH: Tom Halverson, 
Macalester; LAS: Lynn Arthur Steen, St. Olaf. 


THE AUTHORS


JIM PITMAN acquired an appreciation of combinatorics and probability from his father Edwin J. G. 
Pitman, in Hobart, Tasmania. He was an undergraduate at the Australian National University in 
Canberra, and received a Ph.D. from the University of Sheffield. After spending two years in 
Cambridge he settled in Berkeley, California. His research involves the asymptotic description of 
random combinatorial objects, such as partitions, permutations, trees and mappings, in terms of 
Brownian motion and Poisson processes. 


KAREN HUNGER PARSHALL is Associate Professor of Mathematics and History at the University of 
Virginia and editor of Historia Mathematica. Her research focuses primarily on the history of nine- 
teenth- and early twentieth-century algebra. She is the co-author (with David E. Rowe) of The 
Emergence of the American Mathematical Research Community, 1876-1900, has recently completed an 
edition with historical and mathematical commentary of the selected correspondence of J. J. Sylvester, 
and is now at work on a full-scale biography of Sylvester. 


EUGENE SENETA is Professor in the Department of Mathematics and Statistics at the University of 
Sydney and a fellow of the Australian Academy of Science. His research interests are in branching 
processes, nonnegative matrices, regularly varying functions, and the history of probability and statistics, 
especially in France and Eastern Europe. He is the co-author (with C. C. Heyde) of I. J. Bienaymé: 
Statistical Theory Anticipated. 


GEOFFREY GOODSON was educated in Great Britain, obtaining a B.Sc. degree at Hull University, an 
M.Sc. at Warwick University, and a D.Phil. at the University of Sussex. From 1972 until 1989, he taught 
in South Africa at the Universities of Cape Town and Witwatersrand. He is now teaching at Towson 
State University. His academic interests include ergodic theory, dynamical systems, and operator 
theory. His main non-academic interests are his family and playing squash. He was Maryland 40+
squash champion in 1992. 


THOMAS TUCKER received his BA from Harvard University in 1967 and his PhD from Dartmouth 
College in 1971. He has been at Colgate University since 1973, where he is the Charles G. Hetherington 
Professor of Mathematics. His research interests are low-dimensional topology and topological graph 
theory. For the MAA, he has served as Vice President, chaired the committee on Calculus Reform and 
the First Two Years (CRAFTY), and edited Priming the Calculus Pump. He is a co-author of the 
textbook developed by the Calculus Consortium based at Harvard. Although his real ambition was to be 
a TV weatherperson, he ended up in mathematics just like his grandfather, father, brother, and son. 


HOWARD SWANN obtained his Ph.D. in applied mathematics from the University of California at 
Berkeley. He is co-author of Prof. E. McSquared’s Original, Fantastic & Highly Edifying Calculus 
Primer, now out in its new Expanded Intergalactic Version. His research is concerned with contriving new 
computer algorithms for approximating solutions to partial differential equations. 


EDWARD B. BURGER received his Ph.D. from the University of Texas at Austin in 1990. He was a 
postdoctoral Fellow at the University of Waterloo before arriving at Williams College. His research 
interests are in number theory. At the 1996 MathFest he co-produced and co-starred in the first ever 
math comedy play (see reviews in the Nov. 1996 Notices of the AMS and in Ivars Peterson’s MathLand 
column at the MAA web site at http://www.maa.org/.) He is the co-author of two books: The Heart of
Mathematics: An invitation to effective thinking and The Invisible Art, both to be published in 1998. 


FRANK MORGAN works in minimal surfaces. He has a weekly live call-in Math Chat on local cable 
TV, featured in Ivars Peterson’s column MathLand at the MAA web site at http://www.maa.org/, and
a biweekly Math Chat column in The Christian Science Monitor, sometimes available via the web page
http://www.csmonitor.com/. His books include Geometric Measure Theory: a Beginner’s Guide, Riemannian
Geometry: a Beginner’s Guide, and Calculus Lite.


JEFFREY NUNEMACHER entered Oberlin Conservatory as an organ major but after one semester 
transferred to Oberlin College and graduated with a B.A. in mathematics. He then studied at Yale, 
where he lived in a mathematical commune that listed itself in the New Haven phonebook as N. 
Bourbaki, and received a Ph.D. in several complex variables. After teaching at the University of Texas 
at Austin, Kenyon College, and Oberlin College, he is now Professor and Chair of the Department of 
Mathematical Sciences at Ohio Wesleyan University. He still enjoys both music and mathematics. 


ISRAEL KLEINER is professor in the Department of Mathematics and Statistics at York University. 
He received his Ph.D. at McGill University in ring theory. His interests are in the history of 
mathematics, in mathematics education (in a broad sense), and in their interface. 




Is the precision of math illusionary? 


Can math be wrong even if the 
computations are correct? 
How can we cope with math manipulators
and figure fantasies?


Students often have difficulty understanding what mathematics can and 
cannot do for us. This confusion can often lead to math anxiety. Now, 
there are two titles by William J. Adams of Pace University that help 
answer your students’ questions and ease their concerns with math 
manipulators and figure fantasies. 


Get a Grip on Your Math (illustrated by Ramuné 
Adams) helps put math in perspective and neutral- 
ize slippery number slingers. It addresses these 
math-based questions and more: 
• Which numbers best reflect airline reliability?
• Does the $4.2 trillion national debt figure do
more to intimidate than enlighten us?
• Sexuality by the numbers, or not?
• Which polls can we trust?
• Does math make the case for NAFTA?
• How can we distinguish math babble from
math insights?
[1996/272 pages/perfect/$18.95*/ISBN 1-1561]


Get a Firmer Grip on Your Math provides food-for- 
thought questions and in-depth discussion of basic 
ideas introduced in Get a Grip. 
[1996/297 pages/perfect/$18.95*/ISBN 1-1562]


These two books are ideal for courses that emphasize the nature of 
mathematics—what it can do for us and its limitations. Contact Kendall/ 
Hunt Publishing Company to purchase or request a 60-day review copy. 


Phone (800) 228-0810, fax (800) 772-9165, or mail your 
order to: 


KENDALL/HUNT PUBLISHING COMPANY 
4050 Westmark Drive P.O. Box 1840 Dubuque, Iowa 52004-1840
*Prices are subject to change without notice.


PRINCIPLES of SOUND RETIREMENT INVESTING


[Chart: Average annual compound rates of total return (periods ending 12/31/96)* for the CREF Stock Account (10 years), CREF Equity Index Account (since inception 4/29/94), CREF Social Choice Account (5 years; since inception 3/1/90), CREF Global Equities Account (1 year; 3 years; since inception 5/1/92), CREF Growth Account (1 year; since inception 4/29/94), and CREF Bond Market Account (since inception 3/1/90).]


WHILE YOU’RE INVESTING FOR TOMORROW, 
IT’S NICE TO SEE NUMBERS LIKE THESE TODAY. 


Planning a comfortable future takes patience
and an understanding of the long-term
nature of retirement investing. But who can
resist a little bit of immediate gratification
now and then?

While past performance is no guarantee 
of future results, TIAA-CREF’s investment
experts seek high returns over the long term, 
at risk levels appropriate for building future 
security. 

Because variable annuities like CREF 
have no guarantees of principal or returns, 
many people choose to put some of their 
funds into the TIAA Traditional Annuity. 


It guarantees principal and interest, backed
by TIAA's claims-paying ability, and provides
opportunities for added growth through 
dividends. 

Other numbers to think about: With 
over $170 billion in assets and 1.8 million 
participants, we’re the largest retirement
organization in the world, and the leading 
choice in higher education and research. 

For over 75 years, we've pioneered 
new and better ways to help you build a 
comfortable future. If this all sounds good 
to you, there's one more number to consider: 
1 800 842-2776. Call us to learn how our 
expertise can work to your advantage... 
today and tomorrow. 


Visit us on the Internet at www.tiaa-cref.org 


Ensuring the future 
for those who shape it.


©1997 Teachers Insurance and Annuity Association/College Retirement Equities Fund. 730 Third Avenue, New York, NY


*The total returns shown for the CREF variable annuity accounts represent past performance. Total returns and the principal value of investments in the
accounts will fluctuate, and yields may vary. Upon redemption, your accumulation units may be worth more or less than their original price. Investment
results are after all investment, administrative, and distribution expenses have been deducted. Effective April 1, 1988, a registration statement for CREF
variable annuities became effective, under the rules and regulations of the Securities and Exchange Commission, but CREF's management and its objectives
did not change. CREF certificates are distributed by TIAA-CREF Individual and Institutional Services. For more complete information, including charges
and expenses, call 1 800 842-2733, ext. 5509, for a prospectus. Read the prospectus carefully before you invest or send money. 1/97


The Pleasures of Counting
T.W. Körner


Ranging from the design 
of anchors and the Battle 
of the Atlantic to the 
outbreak of cholera in 
Victorian Soho, this text 
describes a variety of lively 
topics that continue to 
intrigue professional 
mathematicians. 


1996 544 pp. 
56087-X Hardback $85.00 
56823-4 Paperback $34.95 


A Primer of Probability Logic 
Ernest W. Adams 


This is a clear well-written text on the subject of probabil- 
ity logic, suitable for advanced undergraduates or gradu- 
ates, but also of interest to professional philosophers. 


CSLI Lecture Notes 
1997 
1-57586-066-X 


c.376 pp. 


Paperback $24.95 


Young Tableaux 
With Applications to Representation Theory and Geometry 


William Fulton 


The aim of this book is to develop the combinatorics 

of Young tableaux and to show them in action in the 

algebra of symmetric functions, representations of the 
symmetric and general linear groups, and the geome- 

try of flag varieties. 

London Mathematical Society Student Texts 35 


1996 269 pp. 
56144-2 Hardback $59.95 
56724-6 Paperback $19.95 


Mathematical Cavalcade 


Brian Bolt 


131 activities, ranging from matchstick and coin puzzles 
through ferrying, railway shunting, dissection, topologi- 
cal and domino problems to a variety of magical num- 
ber arrays with surprising properties are included. 


1992 130 pp. 


42617-0 Paperback $18.95 


Available in bookstores or from 


CAMBRIDGE 


UNIVERSITY PRESS 


A Pathway Into 
Number Theory 


Second Edition 
R.P. Burn 


Now in its second edition, 
this book consists of a 
sequence of exercises that 
will lead readers from sim- 
ple number work to the 
point where they can prove 
algebraically the classical 
results of elementary number theory for themselves. 
1996 267 pp. 
57540-0 Paperback 


$27.95 


Complex Variables 


Introduction and Applications 


Mark J. Ablowitz 
and Athanassios S. Fokas 


Part one provides an introduction to the subject, 
including analytic functions, integration, series, 

and residue calculus and also includes transform 
methods, ODEs in the complex plane, and numeri- 
cal methods. Part two contains conformal mappings, 
asymptotic expansions, and the study of Riemann- 
Hilbert problems. 

Cambridge Texts in Applied Mathematics 16 


1996 c.450 pp.


48523-1 Paperback $34.95 


The Principles of 
Mathematics Revisited 
Jaakko Hintikka 


This book, written by one of philosophy’s preemi- 
nent logicians, argues that many of the basic 
assumptions common to logic, philosophy of 
mathematics and metaphysics are in need of change. 
Hintikka proposes a new basic first-order logic and 
uses it to explore the foundations of mathematics. 


1996 300 pp. 


49692-6 Hardback $59.95 


40 West 20th Street, New York, NY 10011-4211 
Call toll-free 800-872-7423 MasterCard/VISA accepted. 


Prices subject to change. Web site: http://www.cup.org 


American Mathematical Society 


Groups and Symmetry: 
A Guide to Discovering 
Mathematics 


David W. Farmer, Bucknell University, 
Lewisburg, PA 


Knots and Surfaces: A Guide 
to Discovering Mathematics 


David W. Farmer, Bucknell University, 
Lewisburg, PA, and Theodore B. Stanford, 
University of Nevada, Reno 


The book is perfectly suited to a course for non-science 

majors in need of fulfilling a math requirement. All the 
sections have worked well at sparking student interest 

and convincing them that math is much more interesting
than mere number-crunching and graphing.


—William Bloch, Wheaton College 


In most mathematics textbooks, the most exciting 
part of mathematics —the process of invention 
and discovery—is completely hidden from the 
reader. The aim of Groups and Symmetry and Knots 
and Surfaces is to change all that. By means of a 
series of carefully selected tasks, these books lead 
readers to discover some real mathematics. There 
are no formulas to memorize; no procedures to 
follow. The books are guides: their job is to start 
you in the right direction and to bring you back if 
you stray too far. Discovery is left to you. 
Mathematical World, Volume 5 (Farmer); 1996; 

102 pages; ISBN 0-8218-0450-2; List $19; All AMS 
members $15; Order code MAWRLD/5MAA97 
Mathematical World, Volume 6 (Stanford); 1996; 

101 pages; Softcover; ISBN 0-8218-0451-0; List $19; All 
AMS members $15; Order code MAWRLD/6MAA97 


How to Teach Mathematics: a 
personal perspective 


Steven G. Krantz, Mathematical Sciences 
Research Institute, Berkeley, CA 


... an original contribution to the educational literature 
on teaching mathematics at the post-secondary level. 
The book itself is an explicit proof of the author’s claim 
“teaching can be rewarding, useful, and fun.” 


—Zentralblatt für Mathematik


1993; 76 pages; Softcover; ISBN 0-8218-0197-X; 
List $15; All AMS members $12; Order code 
HTMMAA97 


Mathematics and Sports 
L. E. Sadovskiĭ and A. L. Sadovskiĭ


... a nice survey of applications of mathematics in
sporting events ...
—Mathematical Reviews


Treatment is concise and insightful.
—Zentralblatt für Mathematik

This unique book presents simple mathematical 
models of various aspects of sports, with applications
to sports training and competitions.
Requiring only a background in precalculus,
it would be suitable as a textbook for courses
in mathematical modeling and operations
research at the high school or college level. 
Coaches and those who participate in sports 
will find it interesting as well. The lively writing 
style and wide range of topics make this book 
especially appealing. 

Mathematical World, Volume 3; 1994; 152 pages; 
Softcover; ISBN 0-8218-9500-1; List $19; All AMS 
members $16; Order code MAWRLD/3MAA97 


Techniques of Problem Solving 


Steven G. Krantz, Washington University, 
St. Louis, MO 


The purpose of this book is to teach the basic 
principles of problem solving, including both 
mathematical and nonmathematical problems. 
This book will help students to ... 


• translate verbal discussions into analytical data.


• learn problem-solving methods for attacking
collections of analytical questions or data.


• build a personal arsenal of internalized
problem-solving techniques and solutions.


• become “armed problem solvers”, ready to do
battle with a variety of puzzles in different
areas of life.


1996; 465 pages; Softcover; ISBN 0-8218-0619-X; 
List $29; All AMS members $23; Order code 
TPSMAA97 


The Way I Remember It 
Walter Rudin, University of Wisconsin, Madison 


Walter Rudin’s memoirs should prove to be a 
delightful read specifically to mathematicians, but 
also to historians who are interested in learning 
about his colorful history and ancestry. As those 
who are familiar with Rudin’s writing will recog- 
nize, he brings to this book the same care, depth, 
and originality that is the hallmark of his work. 


Co-published with the London Mathematical Society. 
Members of the LMS may order directly from the 
AMS at the AMS member price. The LMS is registered 


Rectangular Invertible Matrices


A. J. Berrick and M. E. Keating 


In everyday mathematics, a matrix with a two-sided inverse must be a square 
matrix. However, there are situations in which rectangular “invertible” matrices do 
occur, and our purpose in this note is to give a brief survey of such situations and the
matrices that arise in them. 

The usual rule for the multiplication of real matrices works equally for matrices 
with entries in any ring R, and we say that an $m \times n$ matrix P, having entries in
R, is invertible if there is a matrix Q with entries in R so that

\[ PQ = I_m \quad\text{and}\quad QP = I_n, \]

where $I_m$ and $I_n$ are identity matrices of the correct sizes. The matrix Q is called
the inverse of P. If it exists, it must be an $n \times m$ matrix and it will be unique.

Whether or not an invertible matrix is necessarily square is a property that 
depends on the ring from which the entries of our matrices are selected. If every 
invertible matrix with entries in R must be square, then R is said to have invariant 
basis number, a term that we justify shortly. Familiar rings, such as fields or the 
integers, do enjoy this property, and some fundamental results show that the 
property can often be transmitted from one ring to another. A consequence is that 
rings with invariant basis number form a large class of rings, encompassing most of 
those encountered by a working mathematician. 

To redress the balance, we give some examples of rings that do not have
invariant basis number, and we show that they can be classified in terms of a 
congruence relation on the set of natural numbers, which tells us the permitted 
sizes of invertible matrices for that ring. Such a congruence relation is in turn 
determined by its “type” (w, d), which may then be called the type of the ring. 

This classification of rings by type is useful in determining whether there can be 
a ring homomorphism between rings of differing types, and leads to the question of 
the possibility of representing a ring of one type as a subring of a ring of another 
type. 

Our results (apart from applications) are not new, and our derivation of the 
theory of congruences on N is elementary, but there does not seem to be any 
account that contains this result in conjunction with a discussion of rings without 
invariant basis number. 

All rings in this paper are associative and possess an identity element; ring 
homomorphisms must preserve identity elements. We take the natural numbers to 
be N = {1,2,...}. 


1. We start with a justification of the terminology and a survey of results concern- 
ing invariant basis number and the lack of it. 

For any integer $n \geq 1$, we write $R^n$ for the space of column vectors of length n
with entries in R; more formally, $R^n$ is the free right R-module of rank n. As in the
theory of vector spaces, the standard unit vectors $e_1, \ldots, e_n$ are linearly independent
and span $R^n$, that is, they form a basis of $R^n$.

Now let P be an $m \times n$ matrix with entries in R. There is a linear transformation
from $R^n$ to $R^m$ defined by the rule $x \mapsto Px$, and, again as in vector space
theory, if P is invertible, the vectors $Pe_1, \ldots, Pe_n$ then form a basis of $R^m$.
However, $R^m$ already has a basis with m members, so if the number of elements in
a basis of $R^m$ is invariant, we must have m = n, that is, our invertible matrix has to
be square.

The argument also works in the reverse direction: if we do have a basis of $R^m$
with n members, we can find an invertible $m \times n$ matrix P that gives the members
of the basis in the form $Pe_1, \ldots, Pe_n$.

Thus we see why the expression “R has invariant basis number” is synonymous 
with the assertion that an invertible matrix over R must be square. Henceforth we 
abbreviate invariant basis number to IBN.

Our discussion also shows that a field must have IBN, since the number of 
elements in a basis of a vector space is just its dimension, which we know to be 
unique. 

Further examples of rings that have IBN are provided by the observation that if 
f: R — S is a ring homomorphism and P is an invertible matrix with entries in R, 
having inverse Q, then the image matrix fP is an invertible matrix with entries in 
S, with inverse fQ. Thus, if S has IBN so also has R. 

For example, suppose that R is commutative. Then it is a fact that there is a 
homomorphism $R \to F$ for some field F ([1], p. 3) and so R has IBN. This
conclusion can also be proved by a direct calculation with matrices ([6], §10.4).

A deeper result is that (right) Noetherian rings have IBN ([4], 4.3, Theorem 7).
Thus, “naturally occurring” rings tend to have invariant basis number. 

The easiest example of a ring without IBN is provided by the cone C(R) of a 
ring R. The elements of C(R) are the infinite matrices over R, with rows and 
columns indexed by the natural numbers, each row and column having only a finite 
number of nonzero entries. The elements 


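\[
\beta = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 1 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 1 & 0 & \cdots \\
\vdots & & & & & &
\end{pmatrix}
\quad\text{and}\quad
\gamma = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 1 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 1 & \cdots \\
\vdots & & & & & &
\end{pmatrix}
\]

of C(R) give an invertible $2 \times 1$ matrix $\alpha = \begin{pmatrix} \beta \\ \gamma \end{pmatrix}$, with inverse the transpose
matrix $(\beta^t \;\; \gamma^t)$.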
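The relations behind this example can be checked on finite truncations. The sketch below is only an illustration under stated assumptions: the helper names (`beta`, `gamma`, `check_cone_relations`) are hypothetical, integer entries stand in for a general ring R, and the truncated matrices are n × 2n rather than the infinite matrices of C(R). It verifies that ββ^t and γγ^t are identities, that βγ^t and γβ^t vanish, and that β^tβ + γ^tγ is the identity, which is exactly what mutual invertibility of α and (β^t γ^t) requires.

```python
def beta(n):
    """Truncated beta: row i has its single 1 in column 2i (0-indexed)."""
    return [[1 if j == 2 * i else 0 for j in range(2 * n)] for i in range(n)]


def gamma(n):
    """Truncated gamma: row i has its single 1 in column 2i + 1 (0-indexed)."""
    return [[1 if j == 2 * i + 1 else 0 for j in range(2 * n)] for i in range(n)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def zero(rows, cols):
    return [[0] * cols for _ in range(rows)]


def check_cone_relations(n):
    """Check the block identities on n x 2n truncations of beta and gamma."""
    b, g = beta(n), gamma(n)
    bt, gt = transpose(b), transpose(g)
    # alpha * (beta^t gamma^t) should be the 2 x 2 identity, block by block.
    left_ok = (matmul(b, bt) == identity(n) and matmul(g, gt) == identity(n)
               and matmul(b, gt) == zero(n, n) and matmul(g, bt) == zero(n, n))
    # (beta^t gamma^t) * alpha should be the 1 x 1 identity.
    right_ok = add(matmul(bt, b), matmul(gt, g)) == identity(2 * n)
    return left_ok and right_ok
```

Calling `check_cone_relations(n)` for a few small n exercises all five relations at once; the truncation works because β picks out the odd-numbered coordinates and γ the even-numbered ones.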

Further examples of rings without IBN are not so easy to find. Some can be 
constructed as follows. Choose (unequal) natural numbers m and n, and let 
$X = (X_{ij})$, $m \times n$, and $Y = (Y_{ji})$, $n \times m$, be matrices whose entries are noncommuting
indeterminates. We can then form a noncommutative polynomial ring
(otherwise known as a free associative algebra) $K\langle X, Y \rangle$ by adjoining the $2mn$
entries of the matrices X and Y to a field K. Now let the Leavitt ring L(m, n) be
the quotient ring of $K\langle X, Y \rangle$ obtained by equating to 0 all the entries of the
matrices $XY - I_m$ and $YX - I_n$. By design, the ring L(m, n) has a pair of mutually
inverse rectangular matrices, of sizes $m \times n$ and $n \times m$, namely the images of X
and Y respectively. What is not at all clear is that L(m, n) is non-trivial and that it 
has no smaller pair of mutually inverse rectangular matrices. Proofs of these 
assertions are given in [10] and [11], and, using different methods, in [3] and [2]. 


2. Given a ring without IBN, it is natural to ask if there is a non-square invertible 
matrix over it of minimal size, and, if so, whether the possible sizes of non-square
invertible matrices can be determined in terms of this minimal size. The answer to
both questions is in the affirmative, which fact leads to a classification of rings 
without IBN. 

The questions are tackled by associating with any ring R, having IBN or not, a 
relation $\sim_R$ on the set N of natural numbers that is defined by the existence of
invertible matrices.

Given a, b ∈ N, we say that $a \sim_R b$ if and only if there is an $a \times b$ invertible
matrix P over R. The relation $\sim_R$ is clearly reflexive, symmetric (consider the
matrix inverse to P), and transitive (if the $b \times c$ matrix Q is invertible, so is PQ).
Thus $\sim_R$ is an equivalence relation.

The relation $\sim_R$ has the further property that it is also a congruence on N.
By definition, such a relation ~ is an equivalence relation in the ordinary sense,
and it satisfies the extra, additivity, property that if a ~ b and c ~ d, then
a + c ~ b + d also.

To see that $\sim_R$ is a congruence, note that if P is an $a \times b$ invertible matrix
and Q is a $c \times d$ invertible matrix, then the block matrix
$\begin{pmatrix} P & 0 \\ 0 & Q \end{pmatrix}$ is also invertible,
of size $(a + c) \times (b + d)$. ∎

When R has IBN, the relation $\sim_R$ is simply equality.


3. Now we turn to the description of a general congruence on N. We start by 
relating congruences on N with the more familiar congruences on the set Z of all 
integers. If we replace N by Z in the definition, it is easy to see that a congruence 
as defined here is simply congruence mod d in the usual sense, where d is the 
smallest natural number congruent to 0 if the congruence is not equality; d is 
called the modulus of the congruence. 

Suppose that ~ is a congruence on N. Then we can extend ~ to a
congruence ≡ on Z by saying that a ≡ b if there is a natural number x so that
a + x ~ b + x.

It is routine to verify that ≡ is in fact a congruence on Z. So, if ~ is not
equality, we can associate with ~ the modulus d of the extension congruence ≡
on Z.

To avoid confusion, we shall always use ~ for a congruence on N and ≡ for a
congruence on Z.

Working in the opposite direction, given any natural number d we can obtain a 
congruence ~ on N by restricting to N an ordinary congruence mod d on the set
Z. For such a congruence on N, the congruence classes are 


\[ \{e, e + d, e + 2d, \ldots\}, \qquad e = 1, \ldots, d. \]

In general, a congruence relation on N is a mixture of equality and a restriction 


of a congruence mod d from Z. More precisely, we prove the following result. 


Theorem 1. Let ~ be a congruence relation on N that is not equality. Then there are 
unique natural numbers w and d so that the set of equivalence classes under ~ 
comprises w − 1 classes

\[ \{1\}, \ldots, \{w - 1\}, \]

each of which has a single element, together with d infinite classes

\[ \{e, e + d, e + 2d, \ldots\}, \qquad e = w, \ldots, w + d - 1. \]

Conversely, any such partition of N gives a congruence relation on N.
The number w is the least natural number congruent to a different natural number
under ~, and the number d is the modulus of the extension of ~ to Z.
The pair (w, d) is called the type of the congruence.


Proof: It is straightforward to check that a partition of N as in the assertion does 
define a congruence on N. 

Working in the other direction, suppose ~ to be a congruence relation on N that is not equality, and let d be the modulus of the extension congruence on Z. Then there exist natural numbers x with x ~ x + d. So define w to be the least such x. Clearly, for any e ≥ w,

e = w + (e − w) ~ w + d + (e − w) = e + d,

so the equivalence class of e contains

{e, e + d, e + 2d, ...}.

For u < w, we show that the equivalence class of u is a singleton. Otherwise, by the additivity property it is infinite and we may take v to be its least member exceeding w − 1. So by the previous paragraph, v ~ v + d. Then from u ~ v we have u + d ~ v + d. Thus by transitivity u ~ u + d, contradicting the definition of w.

Thus the singleton classes are

{1}, ..., {w − 1}

and the remaining classes are as claimed. ∎
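The partition described in Theorem 1 can be made concrete with a short Python sketch. The helper names `class_rep` and `is_congruence` are illustrative and not from the paper, and the check assumes that the additivity property means that a ~ b implies a + c ~ b + c. It only tests a finite range, so it is a sanity check rather than a proof.

```python
def class_rep(x: int, w: int, d: int) -> int:
    """Least member of the class of x for a congruence of type (w, d).

    Numbers below w sit in singleton classes; from w onward the classes
    are residue classes mod d.
    """
    if x < w:
        return x
    return w + (x - w) % d


def is_congruence(w: int, d: int, limit: int = 40) -> bool:
    """Check that a ~ b implies a + c ~ b + c for all 1 <= a, b, c <= limit."""
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            if class_rep(a, w, d) != class_rep(b, w, d):
                continue
            for c in range(1, limit + 1):
                if class_rep(a + c, w, d) != class_rep(b + c, w, d):
                    return False
    return True


# Type (3, 4): singletons {1}, {2}, then four infinite classes starting at 3, 4, 5, 6.
print([class_rep(x, 3, 4) for x in range(1, 12)])
print(is_congruence(3, 4))
```

Mapping each number to the least member of its class keeps the two regimes of the theorem (equality below w, congruence mod d from w onward) visible in a single function.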


Remarks 


(a) The congruences on N that are obtained by restriction from a congruence on Z are those of type (1, d).

(b) It is convenient to give equality the type (∞, 0).

(c) The theorem is known to semigroup theorists, but there does not seem to 
be any account of it in elementary texts on the number system. It can be 
deduced from general results on semigroups given in several texts, for 
example, [8], §1.1.2 and Theorem 5.3. The theorem is stated without proof 
by Cohn in [3]; he also gives an elementary proof in Lemma X.3.1 of [5] in 
the midst of a more sophisticated discussion. 


4. A ring R is said to have the type (w, d) of the congruence $\sim_R$ on N defined in Section 2. This gives us our desired classification of rings that do not have IBN. Thus the rings L(m, n) have type (m, n − m) (taking m < n) and the cone of any ring has type (1, 1).

Notice that the rings with IBN are precisely those of type (∞, 0), since this is the type of the equality congruence.

As an application of the notion of the type of a ring, we compare the types of rings R and S when there is a ring homomorphism f: R → S. If a $\sim_R$ b, then a $\sim_S$ b also, since the images of mutually inverse matrices are again mutually inverse. Thus the relation $\sim_R$ is finer than $\sim_S$.

It follows that each $\sim_S$-equivalence class is a disjoint union of $\sim_R$-equivalence classes. Therefore, if (w, d) and (u, c) are the respective types of R and S, then u ≤ w and c divides d. (Here, we take the symbol ∞ to exceed all natural numbers.)

In the reverse direction, if (w, d) and (u, c) are the respective types of abstract congruences ~ and ≡ on N, and if u ≤ w and c divides d, then it is easy to see that ~ is finer than ≡.
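The comparison of types can be checked numerically. The following Python sketch is only an illustration: the function names are hypothetical, `math.inf` stands in for the symbol ∞, and the brute-force test looks at a finite range of natural numbers.

```python
import math

INF = math.inf  # stands in for the symbol ∞ in the type (∞, 0) of equality


def class_rep(x, w, d):
    """Least member of the class of x under a congruence of type (w, d)."""
    if x < w:  # always true when w is INF, so d = 0 is never used as a modulus
        return x
    return w + (x - w) % d


def is_finer(fine, coarse):
    """Criterion from the text: u <= w and c divides d."""
    (w, d), (u, c) = fine, coarse
    c_divides_d = (d == 0) if c == 0 else (d % c == 0)
    return u <= w and c_divides_d


def is_finer_brute_force(fine, coarse, limit=60):
    """Check directly on 1..limit that related pairs stay related."""
    for a in range(1, limit + 1):
        for b in range(a + 1, limit + 1):
            same_fine = class_rep(a, *fine) == class_rep(b, *fine)
            same_coarse = class_rep(a, *coarse) == class_rep(b, *coarse)
            if same_fine and not same_coarse:
                return False
    return True


print(is_finer((3, 4), (2, 2)), is_finer((2, 2), (3, 4)))
```

With equality encoded as (INF, 0), the same criterion also reproduces the observation that equality is finer than every congruence.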


5. As an application, we recover a result that we proved in Section 1: if there is a homomorphism f: R → S and S has invariant basis number, so also has R. Take (w, d) to be the type of R. Since S has type (∞, 0), we have ∞ ≤ w and 0 | d, which forces the equalities ∞ = w and 0 = d.

In particular, if S is either commutative or Noetherian, then R has IBN.

We can also give a reason why equality and rings with IBN must be assigned the type (∞, 0). On the one hand, a field K has IBN, and, on the other hand, for any pair of natural numbers u and c, K can be embedded in the Leavitt ring L(u, u + c) of type (u, c). Thus the type (w, d) of K must have the properties that w exceeds every natural number and that d is divisible by every natural number.


6. Here are some exercises for readers who know some ring theory. 


(a) Suppose that $D = R_1 \times R_2$, the direct product of rings, and that $R_1$ and $R_2$ have types $(w_1, d_1)$ and $(w_2, d_2)$ respectively. Then D has type $(\max(w_1, w_2), \mathrm{lcm}(d_1, d_2))$.

(b) If R has type (w, d) ≠ (∞, 0), then the ring $M_n(R)$ of n × n matrices over R has type $(\lceil w/n \rceil, d/\gcd(d, n))$, where $\lceil w/n \rceil$ is the least integer with $\lceil w/n \rceil \ge w/n$. Thus if n is a large multiple of d, $M_n(R)$ has type (1, 1).

Note that the type of a ring is not preserved under Morita equivalence of rings (see [4], §4.5, for a definition of this term).

(c) Suppose that a ring R is the union $R = \bigcup_i R_i$ of a set of subrings. Then there is a natural number N so that the rings $R_i$ for i > N all have the same type, which is the type of R. The corresponding statement also holds for more general direct limits of rings.

Inverse limits seem to be less well behaved. 

(d) Let rad(R) be the Jacobson radical of R, that is, the intersection of all the 
maximal right ideals of R. Then R has the same type as R/rad(R). 

Hint: It suffices to show that if an n × k matrix X over R/rad(R) is invertible, then there is some invertible n × k matrix over R that maps to X under the canonical surjection from R to R/rad(R). In fact, any n × k matrix A over R that maps to X must be invertible. To see this, regard A as an R-module homomorphism from $R^k$ to $R^n$ of free right R-modules. Since

$X(R/\mathrm{rad}(R))^k = (R/\mathrm{rad}(R))^n$,

we have

$AR^k + (\mathrm{rad}(R))^n = R^n$.

By Nakayama’s Lemma ([4], §10.3), $AR^k = R^n$, that is, A is a surjection. Thus there is a k × n matrix B with $AB = I_n$. But B must map to the unique inverse of X, so the same argument shows that B has an n × k right inverse, which must be A again.
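For exercises (a) and (b), the stated formulas are easy to tabulate. The Python sketch below simply evaluates them for finite types; the function names are illustrative, and the code does not prove the exercises, it only shows what the formulas predict.

```python
import math


def product_type(t1, t2):
    """Type of a direct product as stated in exercise (a): (max(w1, w2), lcm(d1, d2))."""
    (w1, d1), (w2, d2) = t1, t2
    return (max(w1, w2), d1 * d2 // math.gcd(d1, d2))


def matrix_ring_type(t, n):
    """Type of the n x n matrix ring as stated in exercise (b), for a finite type (w, d)."""
    w, d = t
    ceil_w_over_n = -(-w // n)  # integer ceiling of w / n
    return (ceil_w_over_n, d // math.gcd(d, n))


print(product_type((1, 2), (3, 3)))   # types (1, 2) and (3, 3)
print(matrix_ring_type((5, 6), 4))    # 4 x 4 matrices over a ring of type (5, 6)
print(matrix_ring_type((5, 6), 12))   # n a multiple of d with n >= w
```

The last call illustrates the remark in (b): once n is a sufficiently large multiple of d, the formula collapses to the type (1, 1).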


7. There is no reason why there should be a homomorphism from a ring R to a ring S even when $\sim_R$ is finer than $\sim_S$. For example, there are no homomorphisms between the finite fields $\mathbb{F}_p$ and $\mathbb{F}_q$ if p and q are different primes, although they have the same type.
An interesting question is the extent to which rings of one type can usefully be
represented in terms of rings of another type. For instance, the cone C(K) over a 
field K, which has “terminal” type (1, 1), contains subrings of all other types — this 
follows from Proposition 2.1 of [7], since the Leavitt rings over K all have 
countable dimension. However, it is not clear how the properties of a Leavitt ring 
can be recovered from its embedding in the cone. 

A result in the reverse direction is given in [12]. 


8. A ring R is said to have stable rank 1 (sr R = 1) if for any pair of elements $a_1, a_2$ such that $R = Ra_1 + Ra_2$, there exists b ∈ R with $ba_1 + a_2$ a unit in R. In such a ring, the relation xy = 1 always yields yx = 1 (take $a_1 = 1 - yx$, $a_2 = x$) [9]. Moreover, if sr R = 1, then for all n we have sr $M_n(R)$ = 1 [13]. So suppose that P is an m × n matrix over R with sr R = 1, and that PQ = I for some matrix Q over R. By adjoining n − m zero columns to P, we obtain $P \in M_n(R)$, which has a right inverse Q. Since sr $M_n(R)$ = 1, Q must also be left inverse to P, in contradiction of the fact that QP must contain at least n − m zero columns. It follows that a ring R with stable rank 1 has IBN.
Equal Pay for All Prisoners


Maarten C. Boerlijst, Martin A. Nowak, and Karl Sigmund 


By prisoners we mean, of course, players of the well-known Prisoner’s Dilemma 
game (to be described presently). We shall show that there exist simple strategies 
for the infinitely iterated Prisoner’s Dilemma that act as equalizers in the sense 
that all co-players receive the same payoff, no matter what their strategies are like. 

The Prisoner’s Dilemma game, a favorite with game theorists, social scientists, 
philosophers, and evolutionary biologists, displays the vulnerability of cooperation 
in a minimalistic model (see [1] to [5]). The two players engaged in this game can 
choose whether to cooperate or to defect. If both defect, they gain 1 point each; if 
both cooperate, they gain 3 points; but if one player defects and the other does 
not, then the defector receives 5 points and the other player only 0. The right move 
is obviously to defect, no matter what the other player does. As a result, both 
players earn 1 point instead of 3. 

But if the same two players repeat the game very frequently, there exists no
strategy that is best against all comers. The diversity of strategies is staggering. If
we simulate on a computer populations of strategies evolving under a
mutation-selection regime (with mutation introducing new strategies and selection weeding
out those with lowest payoff), we observe a rich variety of evolutionary histories
frequently leading to cooperative regimes dominated by strategies like Pavlov 
(cooperate whenever the opponent’s move, in the previous round, matched yours) 
or Generous Tit For Tat (always reciprocate your opponent’s cooperative move, 
but reciprocate only two-thirds of the defections). Remarkably, all strategies of the 
iterated Prisoner’s Dilemma, which can be very complex and make up a huge set, 
obtain the same payoff against some rather simple equalizer strategies. 

More generally, let us consider a two-player game where both players have the 
same two strategies and the same payoff matrix. We denote the first strategy (row 
1) by C (for ‘cooperate’) and the second (row 2) by D (for ‘defect’) and write the 
payoff matrix as 


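\[
\begin{array}{c|cc}
 & \multicolumn{2}{c}{\text{Opponent}} \\
 & C & D \\ \hline
C & R & S \\
D & T & P
\end{array}
\tag{1}
\]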


Such games include the Prisoner’s Dilemma, where T > R > P > S, and the
Chicken game, where T > R > S > P. (In the Prisoner’s Dilemma case, R stands 
for the reward for mutual cooperation, P is the penalty for mutual defection, T is 
the temptation payoff for unilaterally defecting and S the sucker payoff for being 
exploited.) 

Let us assume that the game is repeated infinitely often. A strategy in such a supergame is a program telling the player in each round whether to play C or D. The program may be history-dependent and stochastic: it specifies at every step the probability for playing C, depending on what happened so far. If $A_n$ is the payoff in the n-th round, the expected long-run average payoff for a player is given by

\[ \lim_{N \to \infty} \frac{A_1 + \cdots + A_N}{N} \tag{2} \]


provided it exists. It need not always exist: think of two players cooperating in the first 10 rounds, defecting in the next 100 rounds, then cooperating in the following 1000 rounds, etc.

Memory-one strategies are particularly simple. Such a strategy is given by the probability to play C in the first round, and a quadruple $p = (p_R, p_S, p_T, p_P)$, where $p_i$ denotes the probability that the player plays C after having experienced outcome $i \in \{R, S, T, P\}$ in the previous round. Some of the most successful strategies belong to this class, including Generous Tit For Tat (1, 1/3, 1, 1/3) and Pavlov (1, 0, 0, 1).


Theorem. If max(S, P) < min(R, T), then there exist, for every value π between these numbers, memory-one strategies p such that every opponent obtains the long-run average payoff π against a player using such a strategy. The vector p is given by

\[ \bigl(1 - (R - \pi)a,\; 1 - (T - \pi)a,\; (\pi - S)a,\; (\pi - P)a\bigr) \tag{3} \]

where a is any real number such that $1/a > \max(T - \pi, R - \pi, \pi - S, \pi - P)$.


Proof: The condition on a guarantees that the $p_i$ are probabilities. Let us denote by $q_i(n)$ the conditional probability that the opponent plays C in the following round, given that the n-th round resulted in outcome i, and by $s_i(n)$ the probability that the outcome in the n-th round is i. By conditioning on round n, we obtain:

\[
\begin{aligned}
s_R(n+1) ={}& s_R(n)q_R(n)[1 - (R - \pi)a] + s_S(n)q_S(n)[1 - (T - \pi)a] \\
& + s_T(n)q_T(n)(\pi - S)a + s_P(n)q_P(n)(\pi - P)a.
\end{aligned}
\tag{4}
\]

Similarly,

\[
\begin{aligned}
s_S(n+1) ={}& s_R(n)(1 - q_R(n))[1 - (R - \pi)a] + s_S(n)(1 - q_S(n))[1 - (T - \pi)a] \\
& + s_T(n)(1 - q_T(n))(\pi - S)a + s_P(n)(1 - q_P(n))(\pi - P)a.
\end{aligned}
\tag{5}
\]

Summing (4) and (5) yields the probability that you play C in round n + 1:

\[
\begin{aligned}
s_R(n+1) + s_S(n+1) ={}& s_R(n)[1 - (R - \pi)a] + s_S(n)[1 - (T - \pi)a] \\
& + s_T(n)(\pi - S)a + s_P(n)(\pi - P)a.
\end{aligned}
\]

Hence

\[
\begin{aligned}
a^{-1}[s_R(n) + s_S(n) - s_R(n+1) - s_S(n+1)] ={}& R s_R(n) + S s_T(n) + T s_S(n) + P s_P(n) \\
& - \pi[s_R(n) + s_S(n) + s_T(n) + s_P(n)].
\end{aligned}
\tag{6}
\]

Since the $s_i(n)$ sum up to 1, the right-hand side is just $A_n - \pi$, where $A_n$ is the opponent’s payoff in the n-th round (we must bear in mind that one player’s outcome S is the other player’s outcome T). Summing up (6) for n = 1, ..., N and dividing by N, we obtain

\[ \frac{1}{aN}[s_R(1) + s_S(1) - s_R(N+1) - s_S(N+1)] = \frac{A_1 + \cdots + A_N}{N} - \pi \]

and hence

\[ \lim_{N \to \infty} \frac{A_1 + \cdots + A_N}{N} = \pi. \qquad ∎ \]
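The statement of the theorem can be explored numerically. The Python sketch below is an illustration under stated assumptions: the opponent is restricted to memory-one strategies (the theorem itself covers arbitrary opponents), the payoff values T = 5, R = 3, P = 1, S = 0 are the ones quoted in the introduction, and the function names are not from the paper. It iterates the outcome probabilities $s_i(n)$ and averages the opponent's expected payoff over a finite number of rounds.

```python
def equalizer(R, S, T, P, pi, a):
    """Vector (3): probabilities of playing C after outcomes R, S, T, P."""
    p = (1 - (R - pi) * a, 1 - (T - pi) * a, (pi - S) * a, (pi - P) * a)
    assert all(0.0 <= x <= 1.0 for x in p), "a is too large for these payoffs"
    return p


def opponent_average(p, q, payoffs, rounds=20000, first=(0.5, 0.5)):
    """Average expected payoff of the opponent over the first `rounds` rounds.

    p, q: probabilities that the equalizer / the opponent play C after the
    outcomes (R, S, T, P), always named from the equalizer's point of view.
    first: probabilities that each of them plays C in round 1.
    """
    R, S, T, P = payoffs
    opp_pay = (R, T, S, P)  # the equalizer's outcome S is the opponent's T
    x, y = first
    s = [x * y, x * (1 - y), (1 - x) * y, (1 - x) * (1 - y)]
    total = 0.0
    for _ in range(rounds):
        total += sum(si * ai for si, ai in zip(s, opp_pay))
        nxt = [0.0, 0.0, 0.0, 0.0]
        for si, pc, qc in zip(s, p, q):
            nxt[0] += si * pc * qc              # both cooperate
            nxt[1] += si * pc * (1 - qc)        # equalizer C, opponent D
            nxt[2] += si * (1 - pc) * qc        # equalizer D, opponent C
            nxt[3] += si * (1 - pc) * (1 - qc)  # both defect
        s = nxt
    return total / rounds


payoffs = (3, 0, 5, 1)  # (R, S, T, P)
p = equalizer(*payoffs, pi=2, a=0.25)
for q in [(0, 0, 0, 0), (1, 1, 1, 1), (1, 1, 0, 0), (0.3, 0.9, 0.1, 0.6)]:
    print(q, round(opponent_average(p, q, payoffs), 3))
```

By the identity obtained from (6), the finite average differs from π by at most 1/(aN), so with a = 0.25 and N = 20000 every opponent in the loop should land within about 0.0002 of the target value 2.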

A few final remarks. Two players using equalizer strategies are in Nash 
equilibrium, which means that neither has an incentive to change strategy. Nash 
equilibria exist for every game; for iterated games, they abound. Indeed, the 
so-called Folk Theorem in game theory states that every feasible pair of payoff-val- 
ues exceeding the minimax (the highest payoff that a player can enforce, which in 
our case is max(S, P)) can be realized by a Nash-equilibrium pair [2, p. 373]. Our 
theorem is related to this: the strategies are equalizers with memory one. Two 
players using such strategies have no reason to switch unilaterally to another 
strategy, since they cannot improve their payoff; however, they have no reason not 
to adopt another strategy either, since they will not be penalised. Since their 
opponent plays an equalizer strategy, they can switch to any other strategy, and
not be worse off. If both players opt for a change, however, they are likely to end 
up in a non-equilibrium situation. 

If a is chosen small enough, the runs of consecutive defections or cooperations 
can be made arbitrarily long. The condition min(R, T) > max(S, P) and its converse
are not only sufficient, but also necessary for the existence of such equalizer
strategies. It is easy to construct other equalizer strategies. For example, play C
until the opponent’s mean payoff is larger than π, then play D until it is smaller
than π, then play C until it is larger again, etc. However, such a strategy requires
monitoring the opponent’s entire payoff sequence. The point is that even within 
memory-one strategies, equalizers exist. 
The Long and the Short
on Counting Sequences 


Jim Sauerberg and Linghsueh Shu 


1. INTRODUCTION. Consider the sequence of positive integers $S_0 = 2, 1, 1, 4$. $S_0$ consists of two 1’s, one 2, and one 4, so let us define $S_1$ to be this description: $S_1 = 2, 1, 1, 2, 1, 4$. Repeating this process, $S_1$ consists of three 1’s, two 2’s and one 4, so set $S_2 = 3, 1, 2, 2, 1, 4$. Continuing in this way for several more steps produces

$S_3 = 2, 1, 2, 2, 1, 3, 1, 4$
$S_4 = 3, 1, 3, 2, 1, 3, 1, 4$
$S_5 = 3, 1, 1, 2, 3, 3, 1, 4$
$S_6 = 3, 1, 1, 2, 3, 3, 1, 4$.

In general, given any finite sequence of positive numbers $S_0$, this process of constructing $S_{i+1}$ to be the sequence that counts how many times each number in $S_i$ appears in $S_i$ creates a counting sequence $\{S_i\}_{i \ge 0}$. As the reader certainly noticed, in our counting sequence we have $S_5 = S_6 = S_7 = \cdots$. In fact, in any counting sequence, because $S_{i+1}$ is uniquely determined by $S_i$, if there exist numbers p and i such that $S_i = S_{i+p}$, then $S_{i'} = S_{i'+p}$ for all $i' \ge i$. We then say that $\{S_i\}_{i \ge 0}$ is ultimately periodic. The rather surprising main result of [1] is


Theorem 1. For any finite sequence of positive integers $S_0$, the associated counting sequence $\{S_i\}_{i \ge 0}$ is ultimately periodic. In other words, given $S_0$ there are integers $p_0$ and p so that $S_{i+p} = S_i$ for all $i \ge p_0$.


The smallest $p_0$ and smallest p satisfying Theorem 1 are called the pre-period and the period of the counting sequence $\{S_i\}$. Then a periodic counting sequence of period p, or simply a p-cycle, is a counting sequence of pre-period 0 and period p. For example, the counting sequence corresponding to $S_0 = 2, 1, 1, 4$ has pre-period 5 and period 1, that is, it “ends” in a 1-cycle. Similarly, the counting sequence corresponding to $S_0 = 5, 6$ ends in a two-cycle, and the counting sequence corresponding to $S_0 = 6, 7$ ends in a three-cycle.
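The construction and the two definitions above translate directly into a few lines of Python. This is a minimal sketch with illustrative function names; it relies on Theorem 1 for termination of the loop.

```python
from collections import Counter


def next_sequence(seq):
    """Describe seq: for each distinct value, in increasing order, emit (count, value)."""
    counts = Counter(seq)
    out = []
    for value in sorted(counts):
        out.extend((counts[value], value))
    return tuple(out)


def pre_period_and_period(start):
    """Iterate from S_0 = start until some sequence repeats; return (pre-period, period)."""
    seen = {}
    seq, i = tuple(start), 0
    while seq not in seen:
        seen[seq] = i
        seq, i = next_sequence(seq), i + 1
    return seen[seq], i - seen[seq]


for start in [(2, 1, 1, 4), (5, 6), (6, 7)]:
    print(start, pre_period_and_period(start))
```

Storing the index at which each sequence first appears gives both numbers at once: the first repeated sequence marks the start of the cycle, and the gap between its two occurrences is the period.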

Several different types of counting sequences have been studied in recent years 
(see [1], [5], [6], [7], [8], and M4779 in [9]). In this paper we consider these counting 
sequences, bring out their connections, and explore the periodic behavior of each. 
To expand on this, the questions we answer are: 


1) What are the possible periods p? For each p, how many p-cycles are there? 
In Section 3 we find all possible periods and classify all cycles. Partial 
answers to these questions are given in [6]. 

2) A puzzle of Raphael Robinson [3, pp. 389-390] asks the reader to place numbers in the blanks so that the following is true: “In this sentence, the number of occurrences of 0 is __, of 1 is __, of 2 is __, of 3 is __, of 4 is __, of 5 is __, of 6 is __, of 7 is __, of 8 is __, and of 9 is __.” To find such a sentence we must find a one-cycle that contains all of the numbers in base 10, as opposed to the infinite base consisting of all the natural numbers implicitly used in the preceding paragraphs. More generally, one can build counting sequences in base k for any k > 2. Are such counting sequences also eventually periodic? In Section 4 we show that they are, and determine exactly how many different cycles there are in each base. This expands upon the results of [6].

3) What happens when $S_0$ is replaced by an infinite sequence? It is very easy to give infinite sequences $S_0$ such that $\{S_i\}_{i \ge 0}$ is not well-defined. In Section 5 we show how to construct examples of infinite sequences $S_0$ so that $\{S_i\}_{i \ge 0}$ is well-defined and is ultimately periodic. We also give two different methods for constructing infinite sequences $S_0$ so that $\{S_i\}_{i \ge 0}$ is well-defined but not ultimately periodic.

4) The second term, fourth term, sixth term, etc., in each sequence $S_i$ of a
counting sequence do little more than serve as place holders. Assuming there 
is a way to tell which integer each number is describing, what happens if we 
form counting sequences without these place holders? One can then ask 
questions similar to those in 1) for these sequences. These questions have, 
for the most part, been answered in [5], [7], and [8]. We see in Section 6 that 
the answers also follow as very simple corollaries of our work in Sections 2 
and 3. Robinson’s question can also be asked in this context; see [10]. 


In each of the various methods we use to construct counting sequences, the successor sequence lists the number of appearances of a particular digit throughout the entire previous sequence. It is also possible to construct counting sequences in which the successor lists the number of consecutive appearances of a digit: if $C_0 = 2, 1, 1, 4$, then $C_1 = 1, 2, 2, 1, 1, 4$ and $C_2 = 1, 1, 2, 2, 2, 1, 1, 4$. See [2] for Conway’s analysis of such counting sequences.
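The difference between the two kinds of successor is easiest to see in code. The following Python sketch (the function name is illustrative) implements the run-based variant, which counts consecutive appearances only.

```python
from itertools import groupby


def run_length_successor(seq):
    """Replace each maximal run of equal terms by the pair (run length, term)."""
    out = []
    for value, run in groupby(seq):
        out.extend((len(list(run)), value))
    return out


c0 = [2, 1, 1, 4]
c1 = run_length_successor(c0)
c2 = run_length_successor(c1)
print(c1)
print(c2)
```

Here `groupby` splits the sequence into runs without sorting, so the order of the terms matters, in contrast with the counting sequences studied in the rest of the paper.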


2. BASIC PROPERTIES OF COUNTING SEQUENCES. We begin by giving several important properties of the sequences making up a counting sequence, and then give a simple proof of Theorem 1. Fix a finite sequence of positive integers $S_0$ and let $\{S_i\}_{i \ge 0}$ be the corresponding counting sequence. For $i \ge 1$ we write $S_i$ as

$S_i = m_{i,1}, f_{i,1}, m_{i,2}, f_{i,2}, \ldots, m_{i,n_i}, f_{i,n_i}$.

We assume the $f_{i,j}$’s are in increasing order and leave out commas to unclutter the notation when there is no risk of confusion. The positive integer $m_{i,j}$ is called a multiplier of $S_i$ and indicates that the integer $f_{i,j}$, called a factor of $S_i$, appears exactly $m_{i,j}$ times in $S_{i-1}$. Let $|S_i| = 2n_i$ be the total number of terms in $S_i$. The following observations about the $S_i$’s are used often, and frequently without mention. Similar facts are proved in [1] and [6].

Proposition 2. Fix $S_0$ and let $\{S_i\}_{i \ge 0}$ be the corresponding counting sequence. Let $i \ge 2$.


1) For each factor $f_{i,j}$ of $S_i$ there are $m_{i,j} - 1$ or $m_{i,j}$ multipliers of $S_{i-1}$ with the value $f_{i,j}$, depending on whether or not the value $f_{i,j}$ appears as a factor in $S_{i-1}$.

2) We have $|S_{i-1}| = \sum_{j=1}^{n_i} m_{i,j}$ and $|S_{i-1}| \le |S_i|$, because every factor of $S_{i-1}$ is also a factor of $S_i$.

3) If $\{S_i\}_{1 \le i \le p}$ constitutes a p-cycle, then $|S_i| = |S_{i+1}|$ for all i and each $S_i$ in the cycle has exactly the same factors. Further, $|S_{i-1}| = \sum_{j=1}^{n_i} (m_{i,j} - 1) f_{i,j}$.




To show how these facts will be used we provide the following proof of 
Theorem 1. 


Proof of Theorem 1: Fix $S_0$, and let $\max(S_i)$ be the value of the largest term in $S_i$. Clearly either $\max(S_i) \le \max(S_2)$ for all $i \ge 2$, or there is some i such that $\max(S_{i+1}) > \max(S_i)$. First assume the former. For $i \ge 2$, the number of sequences $S_i$ with $\max(S_i) \le n$ for any particular n is at most $(n + 1)^n$, and so is finite. Since $S_{i+1}$ is completely determined by $S_i$, we then see that the counting sequence $\{S_i\}_{i \ge 0}$ must eventually repeat, and so enters a cycle.

So now suppose $\max(S_{i+1}) > \max(S_i)$ for some $i \ge 2$, and choose n so that n + 1 is a term in $S_{i+1}$ and is larger than every term in $S_i$. Since n + 1 can appear in $S_{i+1}$ only as a multiplier, $S_i$ has at least n + 1 equal terms. But clearly $|S_i| \le 2n$, and since $i \ge 2$, the factors in $S_i$ are distinct. It must therefore be the case that all of the multipliers of $S_i$ are equal, $|S_i| = 2n$, and each of the integers from 1 to n appears as a factor in $S_i$. Write $S_i = m, 1, m, 2, \ldots, m, n$ for some $m \ge 1$. Then $mn = \sum_{j=1}^{n} m = |S_{i-1}| \le |S_i| = 2n$ shows $m \le 2$.

If m = 2 then

\[ 2n = |S_{i-1}| \ge \sum_{j=1}^{n} (m - 1) f_j = \sum_{j=1}^{n} j \]

shows that $n \le 3$, and that $S_i$ must be 2, 1 or 2, 1, 2, 2 or 2, 1, 2, 2, 2, 3. A counting sequence containing any of these is easily shown to converge to 2, 1, 3, 2, 2, 3, 1, 4, a one-cycle. A similar argument shows that if m = 1 and $i \ge 2$, then $S_{i-1}$ = 1, 2 or 1, 2, 3, 4 or 1, 2, 3, 4, 5, 6, all of which lead to periodic counting sequences. ∎


3. CYCLES AND THEIR TRUNCATIONS. Theorem 1 ensures that no matter what finite sequence $S_0$ of positive integers we begin with, the counting sequence associated to $S_0$ is ultimately periodic, that is, ends in a cycle of some period p. We now determine the possible periods, and for each p classify the p-cycles. As the word “classify” hints, there are actually infinitely many different cycles, and the sequences in these cycles may be arbitrarily long. Fortunately there are only three possible periods, and each cycle has a companion cycle made up of very short sequences. We will use these truncated sequences to make our classification.

Fix a p-cycle, and for ease, rename the sequences in it $S_1, S_2, \ldots, S_p$. We first show that 1 occurs as a term in each $S_i$, unless the cycle is the one-cycle $S_1 = 2, 2$. This implies that the multiplier of the factor 1 will play an important role in our classification.


Lemma 3. Either 1 occurs at least twice in each $S_i$, or p = 1 and $S_1 = 2, 2$.


Proof: First suppose no $S_i$ has 1 as a factor, so all of the multipliers in each $S_i$ have values larger than 1. Let $|S_i| = 2n$. Since the sum of the n multipliers of $S_i$ equals $|S_{i-1}| = |S_i| = 2n$, all of the multipliers of $S_i$ must equal 2. This is true for all i. But then all of the $S_i$’s have exactly the same multipliers, all of value 2, and exactly the same factors, so this cycle is the one-cycle $S_1 = 2, 2$.

Next, when one $S_i$ has 1 as a factor, then each $S_i$ does. If $S_i$ contains exactly one 1, for some i, then none of $S_i$’s multipliers equal 1. Again, $\sum_{j=1}^{n} m_{i,j} = |S_{i-1}| = |S_i| = 2n$ then implies that all of the multipliers of $S_i$ have the value 2. However, as in the proof of Theorem 1, a counting sequence containing such an element converges to the one-cycle 2, 1, 3, 2, 2, 3, 1, 4, which is not equal to $S_i$, contradicting our assumption. Thus each $S_i$ contains at least two 1’s, as desired. ∎
Next consider the factors whose multipliers are equal to 1. If the factor f of $S_i$ has multiplier 1, then f appears in $S_{i-1}$ only as a factor, so it plays a relatively unimportant role in the creation of $S_i$. This leads us to consider the truncation $S'_i$ of $S_i$ formed by deleting all the multiplier-factor pairs of $S_i$ whose multipliers are 1. For example, if $S_i = 6, 1, 2, 2, 1, 3, 1, 4, 1, 5, 2, 6, 1, 7$, then $S'_i = 6, 1, 2, 2, 2, 6$. We will see that there are rather few sequences that arise as the truncation of a sequence in a cycle, and so will be able to use truncation to classify the cycles.
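Truncation is a simple filter on multiplier-factor pairs. The Python sketch below, with an illustrative function name, reproduces the example just given.

```python
def truncate(seq):
    """Drop every (multiplier, factor) pair whose multiplier is 1."""
    out = []
    for multiplier, factor in zip(seq[0::2], seq[1::2]):
        if multiplier != 1:
            out.extend((multiplier, factor))
    return out


s = [6, 1, 2, 2, 1, 3, 1, 4, 1, 5, 2, 6, 1, 7]
print(truncate(s))
```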

Assume $S_i$ is a sequence belonging to a cycle, and by Lemma 3 that $S'_i$ has the form

$S'_i = m_{i,1}, 1, m_{i,2}, f_{i,2}, \ldots, m_{i,k_i}, f_{i,k_i}$,

with $m_{i,j} \ge 2$ for all j. In studying $S'_i$, the first step is to establish a property similar to part 2 of Proposition 2.


Lemma 4. In a cycle we have $k_i - 1 = |S'_i|/2 - 1 \le \sum_{j=2}^{k_i} (m_{i,j} - 1) = |S'_{i-1}|/2$ for all i. In particular, $|S'_i| \le |S'_{i-1}| + 2$ for all i.


Proof: The first equality and the following inequality are trivial, since $m_{i,j} \ge 2$. For the last equality, since $m_{i,j} - 1$ is the number of multipliers in $S_{i-1}$ with value $f_{i,j}$, the number of multipliers in $S_{i-1}$ that are not equal to 1 is $\sum_{j \ge 2} (m_{i,j} - 1)$. But the multipliers in $S'_{i-1}$ are exactly the multipliers in $S_{i-1}$ that do not equal 1. Thus the sum equals the number of multipliers in $S'_{i-1}$, or $|S'_{i-1}|/2$. ∎


Therefore, in a cycle either $|S'_i| = |S'_{i-1}|$ for all i, or $|S'_i| = |S'_{i-1}| + 2$ for some i. Since each multiplier in $S'_i$ is larger than 1, if $|S'_i| = |S'_{i-1}|$ then Lemma 4 shows that $\{m_{i,j} : j \ge 2\}$ consists of all 2’s except for possibly one 3, while if $|S'_i| = |S'_{i-1}| + 2$ then $m_{i,j} = 2$ for all $j \ge 2$. We next show that the first case corresponds to the one-cycles, and so the second case corresponds to the longer cycles.


Proposition 5. Suppose $\{S_i\}_{1 \le i \le p}$ is a cycle such that $|S'_i| = |S'_{i+1}|$ for all i. Then p = 1, so $S_i$ is actually a one-cycle.


Proof: Since all the $S_i$’s have exactly the same factors, it suffices to show that the multiplier of any particular factor is the same in all of the $S_i$’s. Because the set of multipliers of $S_{i-1}$ is exactly the set of factors of $S'_i$, and we might as well assume $S_i$ is not 2, 2, the only factors of $S_i$ whose multipliers are not 1 are 1, 2, 3, and m, where m is the multiplier of 1 in $S_{i-1}$. We concentrate on these factors. First, $1 + (|S_i| - |S'_i|)/2$ is independent of i and gives the number of 1’s in $S_i$. Thus $1 + (|S_i| - |S'_i|)/2 = m$, and each of the $S_i$’s contains m 1’s. Next, we have seen that each $\{m_{i,j} : j \ge 2\}$ consists of all 2’s except for possibly one 3. Since the value of the sum $\sum_{j \ge 2} (m_{i,j} - 1) = |S'_{i-1}|/2$ is independent of i, as m is, we see that each $S_i$ must contain the same number of 2’s and the same number of 3’s. Finally, for $m \ge 4$, m occurs in each $S_i$ exactly twice, as the multiplier of 1 and as a factor. ∎


This proposition allows us to find the truncations of all the one-cycles. Except for S = 2, 2, the set of multipliers of a one-cycle consists of 2’s, possibly one 3, and the multiplier m of 1. We point out the various cases and let the reader check the details. If m = 2 then S' = 2, 1, 3, 2, 2, 3, while if m = 3 then S' = 3, 1, 2, 2, 3, 3 or S' = 3, 1, 3, 3 depending on whether 2 is a multiplier of S or not. Since m is at least 2, the only other possibility is m ≥ 4, and then S' = m, 1, 3, 2, 2, 3, 2, m. We have proved


Theorem 6. If S is a one-cycle, then S' is 2, 2 or 3, 1, 3, 3 or 2, 1, 3, 2, 2, 3 or 3, 1, 2, 2, 3, 3 or m, 1, 3, 2, 2, 3, 2, m for some 4 ≤ m.


We next turn our attention to the cycles whose periods are longer than 1. From Lemma 4 and Proposition 5 we know that in such a cycle $|S'_i| = |S'_{i-1}| + 2 \ge 4$ for some i, and that the multipliers in $S'_i$ all equal 2, except for possibly the multiplier of 1. In fact, since $2n = |S_i| = |S_{i-1}| = \sum_{j=1}^{n} m_{i,j}$ is a sum of 1’s, at least one 2, and the multiplier of 1, we see the multiplier of 1 in $S_i$ must be at least 3. If we write

$S'_i = m_{i,1}, 1, 2, f_{i,2}, \ldots, 2, f_{i,k_i}$

for $k_i \ge 2$, then

$S'_{i+1} = m_{i+1,1}, 1, k_i, 2, 2, m_{i,1}$

for some $m_{i+1,1}$. Thus if $|S'_i| = |S'_{i-1}| + 2$ for some i, then $|S'_{i+1}| = 6$. Because $|S'_i| \le |S'_{i-1}| + 2$ for all i, it must therefore be the case that $|S'_{i-1}| = 6$, $|S'_i| = 8$, and $|S'_{i+1}| = 6$.

Now write $S'_i = m, 1, 2, a, 2, b, 2, c$ for $m \ge 3$ and some a, b, and c. Clearly $S_{i-1}$ must have m − 1 multipliers equal to 1. Because $|S'_{i-1}| = 6$, we have $|S_i| = |S_{i-1}| = 2(m + 2)$, so the number of 1’s in $S_i$ is $1 + (|S_i| - |S'_i|)/2 = m - 1$. Thus the multiplier of 1 in $S_{i+1}$ is m − 1. Similarly, the multiplier of 1 in $S_{i-1}$ is m or m − 1, depending on whether $|S'_{i-2}|$ is 6 or 8, so $S_{i-1}$ contains either two m’s or two (m − 1)’s.

Before considering these two cases, we show how to construct $S'_{i+1}$ directly from $S'_i$. Any integer f ≠ 1 appears as a factor in $S'_{i+1}$ if and only if it appears as a multiplier in $S'_i$, and then its multiplier in $S'_{i+1}$ is one more than the number of times it appears as a multiplier in $S'_i$. Lemma 3 shows that f = 1 also appears as a factor in $S'_{i+1}$. The next lemma shows how to compute its multiplier.


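Lemma 7. The number of 1’s in $S_i$ is $1 + \sum ((m_{i,j} - 1)(f_{i,j} - 1) - 1)$, where the sum is over the multipliers of $S'_i$.


Proof: Let m be the number of 1’s appearing as multipliers in $S_i$. Then

\[
\begin{aligned}
1 + \sum_{m_{i,j} \in S'_i} \bigl((m_{i,j} - 1)(f_{i,j} - 1) - 1\bigr)
&= 1 + m + \sum_{m_{i,j} \in S_i} \bigl((m_{i,j} - 1)(f_{i,j} - 1) - 1\bigr) \\
&= 1 + m + \sum_{m_{i,j} \in S_i} (m_{i,j} - 1) f_{i,j} - \sum_{m_{i,j} \in S_i} m_{i,j}.
\end{aligned}
\]

The two last sums both equal $|S_{i-1}|$, and 1 + m is the number of 1’s in $S_i$. ∎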
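The rule for passing from one truncation to the next, together with Lemma 7, can be sketched in Python as follows. The function names are illustrative, and the construction is only meaningful for sequences that lie in a cycle, which is the setting of this section; the sample inputs are the truncated cycles that the case analysis arrives at.

```python
from collections import Counter


def ones_via_lemma7(trunc):
    """Number of 1's in a cycle member, computed from its truncation by Lemma 7."""
    pairs = zip(trunc[0::2], trunc[1::2])
    return 1 + sum((m - 1) * (f - 1) - 1 for m, f in pairs)


def next_truncation(trunc):
    """Truncation of the successor of a cycle member, given its own truncation."""
    multipliers = Counter(trunc[0::2])
    out = [ones_via_lemma7(trunc), 1]
    for value in sorted(multipliers):
        # value occurs once as a factor and multipliers[value] times as a multiplier
        out.extend((multipliers[value] + 1, value))
    return out


t1 = [5, 1, 2, 2, 2, 3, 2, 5]
t2 = next_truncation(t1)
t3 = next_truncation(t2)
print(t2, t3, next_truncation(t3) == t1)
```

Since the output lists factors in increasing order, pairs may appear in a different order from the one used in the text, but they describe the same truncation.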


Consider again $S'_i = m, 1, 2, a, 2, b, 2, c$. If $S_{i-1}$ contains two (m − 1)’s, with $m \ge 3$, then $S'_i = m, 1, 2, a, 2, b, 2, m - 1$, for some a and b, and so $S'_{i+1} = m - 1, 1, A, 2, 2, m$, for some A. It is then clear that m cannot be 3. By Lemma 7, a + b = 6, so a = 2 and b = 4. Thus $S'_i = m, 1, 2, 2, 2, 4, 2, m - 1$ and m ≠ 5. Using Lemma 7 again we see

$S'_{i+1} = m - 1, 1, 4, 2, 2, m$
and $S'_{i+2} = m, 1, 2, 2, 2, 4, 2, m - 1 = S'_i$,

for m = 4 or m ≥ 6, and so this is a two-cycle. Notice that when m ≥ 6 the multiplier-factor pair 1, m must appear in $S_i$ and the pair 1, m − 1 in $S_{i+1}$.

If $S_{i-1}$ contains two m’s, then $S'_i = m, 1, 2, a, 2, b, 2, m$ for some a and b, and so $S'_{i+1} = m - 1, 1, A, 2, 2, m$ for some A. By Lemma 7, a + b = 5, so a = 2, b = 3 and $S'_i = m, 1, 2, 2, 2, 3, 2, m$. Thus $S'_{i+1} = m - 1, 1, 4, 2, 2, m$. If m is not equal to 5, we are led to one of the cycles above. If m does equal 5, then using Lemma 7 we have

$S'_i = 5, 1, 2, 2, 2, 3, 2, 5$
$S'_{i+1} = 4, 1, 4, 2, 2, 5$
$S'_{i+2} = 5, 1, 2, 2, 3, 4$

and $S'_{i+3} = S'_i$, so this is a three-cycle. Notice that the pair 1, 3 must appear in $S_{i+1}$ and 1, 5 must appear in $S_{i+2}$. We have proved


Theorem 8. Suppose $\{S_i\}_{1 \le i \le p}$ is a periodic counting sequence with period p > 1. Then either p = 2 or p = 3. In fact,

i. If p = 2 then the truncated form of $\{S_i\}$ is

$S'_1 = m, 1, 2, 2, 2, 4, 2, m - 1$
$S'_2 = m - 1, 1, 4, 2, 2, m$

with m = 4 or m ≥ 6.

ii. If p = 3 then the truncated form of $\{S_i\}$ is

$S'_1 = 5, 1, 2, 2, 2, 3, 2, 5$
$S'_2 = 4, 1, 4, 2, 2, 5$
$S'_3 = 5, 1, 2, 2, 3, 4$.


It is a simple matter to rebuild a cycle from its truncation just by picking reasonable factors. For instance, if S' = 3, 1, 2, 2, 3, 3 then S = 3, 1, 2, 2, 3, 3, 1, 4, 1, 5 gives a one-cycle. Of course, so does S = 3, 1, 2, 2, 3, 3, 1, 5, 1, 20 and, if we expand our possible choice of factors, so does S = 1, −4, 1, 0, 3, 1, 2, 2, 3, 3. In the sequel it will be useful to allow 0 as a factor.
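The three rebuilt sequences can be checked mechanically. The Python sketch below, with illustrative function names, tests whether a sequence read as multiplier-factor pairs describes itself; it compares sets of pairs, so the order in which the factors are listed does not matter.

```python
from collections import Counter


def describe(seq):
    """Set of (multiplier, factor) pairs that describes seq."""
    return {(count, value) for value, count in Counter(seq).items()}


def is_one_cycle(seq):
    """True when seq, read as (multiplier, factor) pairs, describes itself."""
    pairs = set(zip(seq[0::2], seq[1::2]))
    return len(seq) == 2 * len(pairs) and pairs == describe(seq)


for s in ([3, 1, 2, 2, 3, 3, 1, 4, 1, 5],
          [3, 1, 2, 2, 3, 3, 1, 5, 1, 20],
          [1, -4, 1, 0, 3, 1, 2, 2, 3, 3],
          [3, 1, 2, 2, 3, 3]):
    print(s, is_one_cycle(s))
```

The last entry is the bare truncation, which fails the test: it needs the two extra pairs with multiplier 1 to supply the missing 1’s.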

Notice that no three-cycle can contain more than seven factors, or more than two factors larger than 5. Thus, if $S_0$ contains eight or more distinct numbers, or two or more distinct numbers larger than 5, then the cycle to which its counting sequence converges cannot have period 3. Nor can it converge to any one-cycle, except for one whose truncation has the form m, 1, 3, 2, 2, 3, 2, m. Since most finite sequences $S_0$ contain three different numbers larger than 5, we see most counting sequences converge to a one-cycle of the form m, 1, 3, 2, 2, 3, 2, m, or to a two-cycle. In fact, the multiplier of 2 can be used to distinguish between these last two cases, but unfortunately we do not have methods to predict the multipliers of 2. It would be quite interesting to have a more precise answer.

Similarly, we would like to have a method to determine the pre-period of a given $S_0$, that is, to be able to measure how far $S_0$ is from entering a cycle.


4. CYCLES IN A FINITE BASE. From Theorem 6 the numerical portion of an 
answer to Raphael Robinson’s puzzle is 

1,0,7,1,3,2,2,3,1,4,1,5,1,6,2,7,1,8,1,9. 

Is this answer unique? No, there is another: 

1,0,11,1,2,2,1,3,1,4,1,5,1,6,1,7,1,8,1,9, 

if we read 11 as two 1’s. This makes sense only if we represent the value eleven as 
1·10^1 + 1·10^0, i.e., if we write our numbers in base 10. This example reveals the 
basic and interesting difference between the counting sequences over finite bases 
and those over the infinite base: when we use a finite base the multipliers can 
consist of multiple digits, which, by definition, is impossible over the infinite base. 
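
Both answers can be verified by counting decimal digits. The following is a minimal sketch under the assumption that an answer is stored as a list of (multiplier, factor) pairs and that every digit of every multiplier and factor is counted; the function names are hypothetical.

```python
from collections import Counter

def digit_counts(pairs):
    """Count the decimal digits appearing in all multipliers and factors."""
    text = "".join(f"{m}{f}" for m, f in pairs)
    return Counter(int(ch) for ch in text)

def is_self_describing(pairs):
    """True if each factor f occurs exactly m times and no other digit occurs."""
    counts = digit_counts(pairs)
    factors = {f for _, f in pairs}
    return set(counts) == factors and all(counts[f] == m for m, f in pairs)

first = [(1, 0), (7, 1), (3, 2), (2, 3), (1, 4),
         (1, 5), (1, 6), (2, 7), (1, 8), (1, 9)]
# Second answer: the multiplier 11 contributes two digits 1.
second = [(1, 0), (11, 1), (2, 2), (1, 3), (1, 4),
          (1, 5), (1, 6), (1, 7), (1, 8), (1, 9)]

print(is_self_describing(first), is_self_describing(second))
```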

What cycles are possible if we must choose the factors from the digits 0 through 
k − 1 and consider the multipliers in base k? If the factors are kept smaller than 
k, then Theorems 6 and 8 provide examples of cycles in base k. In base 5, for 
instance, 

1,0,3,1,1,2,3,3 and 2,1,3,2,2,3,1,4 

are one-cycles. However (11)_5,1,1,2,1,3,1,4 is also a one-cycle in base 5, where 
(11)_5 is the representation of the number six in base 5, so these theorems do not 
list all of the cycles. It thus remains for us to find the cycles that contain at least 
one multiplier with multiple digits in base k. 

We first show that given any sequence S_0 the counting sequence {S_i}_{i≥0} formed 
in base k is eventually periodic. As in Section 2, it suffices to show that S_i, i ≥ 1, 
can take on only finitely many forms. We now write |S_i|_k for the total number of 
digits appearing in S_i when it is written in base k. For example, |(11)_3,1,1,2|_3 = 5. 
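
The digit count |S|_k is easy to compute once the multipliers are stored as ordinary integers. The sketch below assumes a sequence is given as (multiplier, factor) pairs; `to_base` and `length_in_base` are hypothetical helper names.

```python
def to_base(n, k):
    """Digits of the non-negative integer n in base k, most significant first."""
    digits = [n % k]
    n //= k
    while n:
        digits.append(n % k)
        n //= k
    return digits[::-1]

def length_in_base(pairs, k):
    """|S|_k: total number of base-k digits over all multipliers and factors."""
    return sum(len(to_base(m, k)) + len(to_base(f, k)) for m, f in pairs)

# (11)_3 is the number four, so (11)_3,1,1,2 is stored as [(4, 1), (1, 2)].
example = [(4, 1), (1, 2)]
# The base-5 one-cycle (11)_5,1,1,2,1,3,1,4, where (11)_5 is six.
base5_cycle = [(6, 1), (1, 2), (1, 3), (1, 4)]

print(length_in_base(example, 3), length_in_base(base5_cycle, 5))
```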


Lemma 9. In base k ≥ 4, we have |S_i|_k ≤ 2k + 1 for all sufficiently large i. 


Proof: We simply show that if |S_{i−1}|_k ≤ |S_i|_k for some i, then |S_i|_k ≤ 2k + 1. As in 
Proposition 2 we have |S_{i−1}|_k = \sum_j m_j, where the m_j’s are the multipliers of S_i. 
Letting #m_j be the number of digits of m_j in base k, we then have 


|S_i|_k = (the number of factors in S_i) + \sum_{m_j \in S_i} #m_j \le k + \sum_{m_j \in S_i} #m_j.    (4.1) 


Using \sum_j m_j = |S_{i−1}|_k ≤ |S_i|_k, we see that 


\sum_{m_j \in S_i} (m_j − #m_j) \le k.    (4.2) 


But m_j − #m_j is at least k − 2 if m_j ≥ k ≥ 4. Thus, for k ≥ 5 there can be at 
most one multiplier of S_i consisting of multiple digits in base k, and its value can 
be at most k + 2. When k = 4, one shows easily that a sequence in a counting 
sequence with two multipliers larger than 3 must have multipliers 4 = (10)_4, 
4 = (10)_4, 1, and 1, and its counting sequence converges to 1,0,(11)_4,1,2,2,1,3. 
Therefore |S_i|_k ≤ 2k + 1, as desired. ∎ 


Thus, when k is at least 4 there are only finitely many sequences that may 
appear in any given counting sequence. Inequality (4.2), which holds in any base, 
shows that 5 is the largest possible value of a multiplier in a counting sequence in 
base 2 or 3, and so also over these two bases a sequence in any given counting 
sequence may take on only finitely many forms. Therefore, all counting sequences 
in base k are eventually periodic for all k. 

Checking the possibilities, which we leave to the reader, in base 2 the only 
cycles are (11)_2,1 from Theorem 8, and (11)_2,0,(100)_2,1. In base 3, Theorem 6 
gives only 2,2, while Theorem 8 gives three one-cycles. The only other one-cycles 
in base 3 are 


(10)_3,0,(10)_3,1,2,2 and 2,0,2,1,(10)_3,2 and (10)_3,0,(10)_3,1, 

and the only longer cycle is 

S_1 = 1,0,(10)_3,1,(10)_3,2 
S_2 = (10)_3,0,(11)_3,1,1,2 
S_3 = 2,0,(12)_3,1,1,2. 
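
The base-3 three-cycle can be confirmed by describing each element digit by digit. This is a sketch under the assumption that sequences are stored as (multiplier, factor) pairs with integer values; `describe_in_base` is a hypothetical name, and the multipliers (10)_3, (11)_3, (12)_3 are entered as 3, 4, 5.

```python
from collections import Counter

def to_base(n, k):
    """Digits of the non-negative integer n in base k, most significant first."""
    digits = [n % k]
    n //= k
    while n:
        digits.append(n % k)
        n //= k
    return digits[::-1]

def describe_in_base(pairs, k):
    """Count every base-k digit of every multiplier and factor, and return
    the description as (count, digit) pairs in increasing digit order."""
    counts = Counter()
    for m, f in pairs:
        counts.update(to_base(m, k))
        counts.update(to_base(f, k))
    return [(counts[d], d) for d in sorted(counts)]

s1 = [(1, 0), (3, 1), (3, 2)]   # 1,0,(10)_3,1,(10)_3,2
s2 = [(3, 0), (4, 1), (1, 2)]   # (10)_3,0,(11)_3,1,1,2
s3 = [(2, 0), (5, 1), (1, 2)]   # 2,0,(12)_3,1,1,2

print(describe_in_base(s1, 3) == s2,
      describe_in_base(s2, 3) == s3,
      describe_in_base(s3, 3) == s1)
```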
Now suppose k is at least 4, and, by Lemma 9, that S_i is a sequence with one 
multiplier M such that k ≤ M ≤ k + 2. If S_i has f factors, then S_{i−1} has at most f 
factors, and since they each have no more than one multiplier with two digits, we 
see that |S_{i−1}|_k ≤ |S_i|_k. Thus inequality (4.2) may be more accurately stated as 


k − 2 \le \sum_{m_j \in S_i} (m_j − #m_j) \le (|S_i|_k − 1)/2 \le k.    (4.3) 


If M = k + 2, then all of the other multipliers in S_i must equal 1, and |S_i|_k = 
2k + 1. One then sees that S_{i+1} has the form 


S_{i+1} = 1,0,(11)_k,1,2,2,1,3,...,1,k − 1,    (4.4) 


which constitutes a one-cycle. If M = k + 1, then the other multipliers in S_i equal 
1, except for possibly one 2. When 2 is a multiplier of S_i, we have |S_i|_k = 2k + 1 
and then S_{i+1} is as given in (4.4). When 2 is not a multiplier in S_i, then 
|S_i|_k = 2k − 1 and 


S_{i+1} = 1,0,(11)_k,1,1,2,...,\widehat{1,l},...,1,k − 1 


for some 0 ≤ l ≤ k − 1, l ≠ 1, where \widehat{1,l} means that this pair does not appear in 
S_{i+1}. Clearly S_{i+1} forms a one-cycle. Finally, if M = k = (10)_k, then it is not 
difficult to use part 2) of Proposition 2 to show that {S_i}_{i≥0} converges to a 
one-cycle consisting of terms having one base k digit each. We have proved 


Proposition 10. The only cycles in base k ≥ 4 that have multipliers with two or more 
digits are one-cycles. Further, if S' is the truncated version of one of these sequences, 
then S' is either (11)_k,1 or (11)_k,1,2,2. ∎ 


We have now discovered all possible cycles in base k. Since k is finite, the 
number of cycles is finite and can be counted. 


Theorem 11. For k ≥ 4, the number of one-cycles in base k is 2^{k−4} + k(k − 1)/2. 
In bases 4 and 5 there are no longer cycles, while in base k ≥ 6 there are 2^{k−5} − 1 
two-cycles, \binom{k−5}{2} three-cycles, and no longer cycles. 


The proof of Theorem 11 is just a matter of undoing the truncation process, and 
then using the binomial theorem. For example, for each m ≥ 4 and k ≥ 6 there 
are \binom{k−4}{m−1} one-cycles S with S' = m,1,3,2,2,3,2,m, so there are 


\sum_{m \ge 4} \binom{k−4}{m−1} = 2^{k−4} − \sum_{j=0}^{2} \binom{k−4}{j} 


one-cycles S with S' having the form m,1,3,2,2,3,2,m. We leave the rest of the 
proof to the reader. 
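
The binomial-theorem step in this count can be checked numerically. The sketch below is only an arithmetic check, assuming that a one-cycle with truncation m,1,3,2,2,3,2,m is fixed by choosing its m − 1 remaining factors among the k − 4 digits other than 1, 2, 3, and m; the function names are hypothetical.

```python
from math import comb

def count_by_sum(k):
    """Sum over m >= 4 of C(k-4, m-1); terms with m-1 > k-4 are zero."""
    return sum(comb(k - 4, m - 1) for m in range(4, k))

def count_by_binomial_theorem(k):
    """2^(k-4) minus the three terms of the binomial expansion that are left out."""
    return 2 ** (k - 4) - sum(comb(k - 4, j) for j in range(3))

for k in range(6, 13):
    print(k, count_by_sum(k), count_by_binomial_theorem(k))
```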


5. INFINITE SEQUENCES AND INFINITE CYCLES. From Theorem 1 we know 
that every counting sequence beginning with a finite sequence S_0 is ultimately 
periodic. Is this true when S_0 is an infinite sequence? In this section we show that 
it is not and provide two methods for constructing counter-examples. 

If one chooses an infinite sequence S_0 at random, its associated counting 
sequence may fail to exist. For example, if we choose S_0 = 1,2,3,4,5,6,..., then 
S_1 = 1,1,1,2,1,3,1,4,1,5,1,6,..., but S_2 is not well-defined (S_1 contains infinitely 
many 1’s) and so {S_i}_{i≥0} does not exist. It would be interesting to have necessary 
or sufficient conditions on S_0 so that its counting sequence exists. We will not 
concern ourselves here with general existence and convergence questions but 
instead concentrate on supplying a variety of examples. 

We begin by constructing infinite sequences whose associated counting se- 
quences are actually one-cycles. First, let S_0^1 = 4,4, and define S_0^2 = 4,4,4,5,4,6. 
Notice that there are four 4’s in S_0^2, which fits the description given in S_0^1. Next, 
create S_0^3 to fit the description given in S_0^2 and to have consecutive factors, and 
then similarly create S_0^4 by the description implicit in S_0^3: 


S_0^3 = 4,4,4,5,4,6,5,7,5,8,5,9,6,10,6,11,6,12 

S_0^4 = 4,4,4,5,4,6,5,7,5,8,5,9,6,10,6,11,6,12,7,13,7,14,7,15,7,16,8,17,8, 
18,8,19,8,20,9,21,9,22,9,23,9,24,10,25,10,26,10,27,10,28,10,29,11, 
30,11,31,11,32,11,33,11,34,12,35,12,36,12,37,12,38,12,39. 

Finally, define S_0 to be the limit of the finite sequences {S_0^j}. It is then clear that 
S_0 forms a one-cycle, and so each element of the counting sequence {S_i} exists. We 
adopt the terminology of [3] to call the process that takes the finite sequence S_0^1 
and produces the infinite sequence S_0 the self-generating process. 
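
One step of the self-generating process can be sketched as follows. This is an interpretation of the construction rather than the authors' own formulation: it assumes a finite stage is stored as (multiplier, factor) pairs, and that the next stage is obtained by appending pairs with new consecutive factors until each factor f of the current stage occurs as often as its multiplier says. The name `self_generate` is hypothetical.

```python
def self_generate(pairs):
    """Extend a finite stage so that it fits its own description.

    For each pair (m, f) of the input, append pairs (f, new_factor) with
    consecutive new factors until f occurs exactly m times in the output.
    """
    out = list(pairs)
    next_factor = out[-1][1] + 1
    for m, f in pairs:
        # Occurrences of f so far, as a multiplier or as a factor.
        have = sum((m2 == f) + (f2 == f) for m2, f2 in out)
        for _ in range(m - have):
            out.append((f, next_factor))
            next_factor += 1
    return out

stage = [(4, 4)]
for _ in range(3):
    stage = self_generate(stage)
print(len(stage), stage[-1])
```

Starting from 4,4, three steps reproduce the 36 pairs of the fourth stage listed above, ending with the pair 12,39.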


Proposition 12. Let S = m_1,f_1,m_2,f_2,...,m_n,f_n be a sequence of positive integers 
such that the f_i are strictly increasing, f_i appears no more than m_i times in S, and each 
m_i also appears as an f_j. Then, setting S_0^1 = S, the sequences S_0^j can be constructed 
using the self-generating process, S_0 = lim_j S_0^j exists, and S_0 forms a one-cycle. 


To give the relative sizes of the factors and multipliers of our particular example 
S_0 = 4,4,4,5,4,6,... we introduce an integer sequence constructed and studied 
first by Golomb [3]. This sequence 

1,2,2,3,3,4,4,4,5,5,5,6,6,6,6,7,7,7,7,8,... 

consists of the values of the function G(n) defined on the natural numbers by 

(i) G(1) = 1, 

(ii) G(n) = #{integers m : G(m) = n}, 

(iii) G(n) is non-decreasing. 

Golomb proved the asymptotic formula G(n) ~ φ(n/φ)^{φ−1}, where φ = 
(√5 + 1)/2 is the golden ratio. If we replace the first three terms of Golomb’s 
sequence by a 3, and then add 1 to each term, the resulting sequence consists of 
the multipliers of S_0. Thus, the multiplier of f in S_0 is approximately G(f). 
Inverting the asymptotic formula for G(f) then gives 


Proposition 13. Let m_f be the multiplier of f in S_0 = 4,4,4,5,.... Then 

f ~ φ(m_f/φ)^φ. 

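
The relation between Golomb's sequence and the multipliers of S_0 can be checked on an initial segment. The sketch below builds the sequence run by run from its defining property (the value n occurs G(n) times); the name `golomb` is hypothetical, and the list of expected multipliers is read off the finite stage 4,4,4,5,4,6,5,7,... given earlier.

```python
from math import sqrt

def golomb(n_terms):
    """First n_terms values G(1), G(2), ... of Golomb's sequence."""
    g = [1, 2, 2]
    n = 3
    while len(g) < n_terms:
        g.extend([n] * g[n - 1])   # the value n occurs G(n) times
        n += 1
    return g[:n_terms]

g = golomb(1000)
# Replace the first three terms by a single 3, then add 1 to every term.
multipliers = [x + 1 for x in [3] + g[3:]]

expected = [4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9]
print(multipliers[:18] == expected)

# Ratio of G(1000) to the asymptotic value phi * (n / phi) ** (phi - 1).
phi = (sqrt(5) + 1) / 2
ratio = g[999] / (phi * (1000 / phi) ** (phi - 1))
print(ratio)
```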
It is a simple matter to modify S_0 to create counting sequences consisting of 
infinite sequences that converge to longer cycles. For instance, define S_0(24) to be 
the sequence that is identical to S_0 except that the multiplier-factor pair 9,24 is 
replaced by 10,24, i.e., 

S_0(24) = 4,4,4,5,4,6,5,7,5,8,5,9,6,10,6,11,6,12,7,13,7,14,7,15,7,16,8,17, 
8,18,8,19,8,20,9,21,9,22,9,23,10,24,10,25,10,26,10,27,10,28,... . 


314 THE LONG AND THE SHORT ON COUNTING SEQUENCES [April 


We say that the multiplier-factor pairs of S_0(24) that do not agree exactly 
with S_0 are the positions of S_0(24) that are in “error” when compared to S_0; here 
the only such pair is 10,24. If S_1(24) is the usual description of the sequence 
S_0(24), then, since S_0(24) contains one more 10 but one fewer 9 than S_0 contains, 


S_1(24) = 4,4,4,5,4,6,5,7,5,8,4,9,7,10,6,11,6,12,7,13,7,14,7,15,7,16,8,17, 
8,18,8,19,8,20,9,21,9,22,9,23,9,24,10,25,10,26,10,27,10,28,..., 

and 

S_2(24) = 5,4,3,5,3,6,6,7,5,8,5,9,6,10,6,11,6,12,7,13,7,14,7,15,7,16,8,17, 
8,18,8,19,8,20,9,21,9,22,9,23,9,24,10,25,10,26,10,27,10,28,... . 


Notice that the multiplier-factor pair in error in S_0(24) has been “repaired” in 
S_1(24), and that the errors in S_1(24) are repaired in S_2(24). Also notice that the 
numerical values of the multipliers in error in S_0(24) and factors in error in S_1(24) 
are very close. The same is true for the multipliers in error in S_1(24) and the 
factors in error in S_2(24). If we continue, the counting sequence converges to 


S_10(24) = 1,1,4,2,2,3,2,4,5,5,4,6,5,7,5,8,5,9,6,10,... 
S_11(24) = 2,1,3,2,1,3,3,4,5,5,4,6,5,7,5,8,5,9,6,10,... 
S_12(24) = 2,1,2,2,3,3,2,4,5,5,4,6,5,7,5,8,5,9,6,10,... 
S_13(24) = S_10(24) = 1,1,4,2,2,3,2,4,5,5,4,6,5,7,5,8,5,9,6,10,... . 


So we have constructed an example of an infinite three-cycle. 

We can abstract two facts from this example. Suppose we create an infinite 
sequence S_0(f) that is identical to an infinite one-cycle S_0 except that the 
multiplier of f in S_0 has been increased by one. Then, (1) at the beginning of the 
counting sequence {S_i(f)}_{i≥0} the errors move quickly to the “left”, and (2) once 
the errors have reached the beginning of the sequences, they (relatively) quickly 
settle into a cycle. It is not too difficult to convince oneself of these facts, because 
Proposition 13 tells us that a multiplier in S_0 is far smaller than its factor. 

More generally, one may define S_0(f_1,f_2,...,f_n) to be identical to an infinite 
one-cycle S_0 except that the multipliers of the f_j’s in S_0 have been increased by 
one, and then consider the counting sequence {S_i(f_1,f_2,...,f_n)}_{i≥0}. Each such 
counting sequence ends in a cycle. It would be interesting to classify the cycles that 
arise in this manner. 

Using these ideas we can describe the construction of an infinite counting 
sequence that is not ultimately periodic. For a given integer f_j let n_j be the 
pre-period of {S_i(f_j)}_{i≥0}. That is, S_i(f_j) is part of a cycle if i ≥ n_j. Choose an 
infinite set of factors f_j, j ≥ 1, growing fast enough in j so that for all i ≤ n_{j−1} and 
f ≤ f_{j−1}, the multipliers of the factor f in S_0 and S_i(f_j) are equal. In other words, 
choose f_j so that it takes more than n_{j−1} steps for the errors in S_i(f_j) to move 
themselves to the point of the initial error in S_0(f_{j−1}). Once we fix such an infinite 
sequence, then {S_i(f_1,f_2,f_3,...)}_{i≥0} will be a non-periodic counting sequence. To 
actually construct such a family of f_j’s one needs to use Proposition 13 to give a 
careful study of the rates at which the errors in {S_i(f_j)} spread and move to the 
left. 

As this study would occupy the better part of several pages, we instead end this 
section with a very simple method for constructing infinite counting sequences that 
are both well-defined and not ultimately periodic. Define {S_i}_{i≥1} by the following 
rules: 


(i) The multiplier of i in S_i has value at least i. 

(ii) Every natural number occurs as a factor in each S_i. 

(iii) The multipliers in each S_i form a non-decreasing sequence. 

(iv) S_{i+1} is the description of S_i for i ≥ 1. 


Rule (i) ensures that the terms below the main diagonal are not influenced by those 
above the main diagonal. For instance, taking the multiplier of i to be i + 1 gives 
the following: 


S_1 = 2,1,2,2,3,3,3,4,4,5,4,6,5,7,5,8,5,9,6,10,6,11,6,12,7,13,7,14,7,15,... 
S_2 = 1,1,3,2,3,3,3,4,4,5,4,6,4,7,5,8,5,9,5,10,6,11,6,12,6,13,7,14,7,15,... 
S_3 = 2,1,1,2,4,3,4,4,4,5,4,6,5,7,5,8,5,9,5,10,6,11,6,12,6,13,6,14,7,15,... 
S_4 = 2,1,2,2,1,3,5,4,5,5,5,6,5,7,5,8,6,9,6,10,6,11,6,12,6,13,7,14,7,15,... 
S_5 = 2,1,3,2,1,3,1,4,6,5,6,6,6,7,6,8,6,9,6,10,7,11,7,12,7,13,7,14,7,15,... 
S_6 = 3,1,2,2,2,3,1,4,1,5,7,6,7,7,7,8,7,9,7,10,7,11,7,12,8,13,8,14,8,15,... 
S_7 = 3,1,3,2,2,3,1,4,1,5,1,6,8,7,8,8,8,9,8,10,8,11,8,12,8,13,8,14,9,15,... 
S_8 = 4,1,2,2,3,3,1,4,1,5,1,6,1,7,9,8,9,9,9,10,9,11,9,12,9,13,9,14,9,15,... . 


In S_{i+1} the multiplier of i is either 2 or 1, depending on whether the multiplier of 
i in S_i is i or greater than i. Therefore {S_i}_{i≥1} is well-defined but is not ultimately 
periodic. 


6. FACTOR-FREE COUNTING SEQUENCES. We end this paper the way we 
began it: by using the sequence 2,1,1,4 to build a type of counting sequence. 
Because 2,1,1,4 consists of 2 ones, 1 two, 0 threes, and 1 four, let us define R_1 to 
be the numbers making up this description: R_1 = 2,1,0,1. Repeating this process, 
R_1 consists of 1 zero, 2 ones, 1 two, 0 threes, and 0 fours, so set R_2 = 1,2,1,0,0. 
Continuing we have 


R_3 = 2,2,1,0,0 
R_4 = 2,1,2,0,0 
R_5 = 2,1,2,0,0. 


We call the sequence {R_i}_{i≥0} a factor-free counting sequence. The cycles of 
factor-free sequences are called self-descriptive and co-descriptive strings in [5], [7], 
and [8]. 

Since a factor-free counting sequence is built without the explicit benefit of 
place-keeping factors, we need a method for indicating which integer each term in 
each R_i describes. For i > 1 we assume that the j-th entry of R_i gives the number 
of times j − 1 appears in R_{i−1}, and that this entry is 0 if j − 1 does not appear in 
R_{i−1} but some integer at least as large as j − 1 appears in some R_{i'}, 0 ≤ i' < i. 
Then, just as the first number in an element S_i of a counting sequence almost 
always describes the number of 1’s in S_{i−1}, the first number in an element R_i of a 
factor-free counting sequence describes the number of 0’s in R_{i−1}. 

Of course, one may allow the first digit of a sequence to describe numbers other 
than 0. The only finite example of this is the one-cycle 1, which is the factor-free 
version of the one-cycle 2,2. There are, however, many infinite examples. For 
instance, Golomb’s sequence can be thought of as an infinite factor-free one-cycle 
that begins by describing the number of 1’s it contains. 
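
The factor-free step can be sketched in a few lines. The code below is an illustration under one reading of the convention just stated: an element has one entry for each integer from 0 up to the largest integer seen so far in the counting sequence, so the caller passes that running maximum along. The name `factor_free_step` is hypothetical, and the iteration starts from R_1 = 2,1,0,1 together with the 4 that occurs in 2,1,1,4.

```python
def factor_free_step(r, largest_seen):
    """Return the next element and the updated largest integer seen so far.

    Entry j of the result is the number of times j occurs in r, for every
    j from 0 up to the largest integer seen in the sequence so far.
    """
    largest_seen = max(largest_seen, max(r))
    return [r.count(j) for j in range(largest_seen + 1)], largest_seen

r, largest = [2, 1, 0, 1], 4
history = [r]
for _ in range(4):
    r, largest = factor_free_step(r, largest)
    history.append(r)

for i, element in enumerate(history, start=1):
    print(f"R_{i} =", element)
```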
Using techniques similar to those in Section 2 it is easy to show that if R_0 is a 
finite sequence of non-negative integers, then the factor-free counting sequence 
{R_i}_{i≥0} is ultimately periodic. To find all of the possible cycles we will relate the 
factor-free and “ordinary” counting sequences. Following [1], say that an element 
S_i of a counting sequence is complete if its factors are consecutive and the smallest 
factor is 1. If S = m_1,f_1,m_2,f_2,...,m_n,f_n is a complete element of a counting 
sequence, then defining R = n_0,n_1,...,n_{n−1} by n_j = m_{j+1} − 1 gives a factor- 
free sequence. Similarly, given a sequence R = n_0,n_1,...,n_l of a factor-free 
counting sequence, defining S by f_j = j and m_j = n_{j−1} + 1 for 1 ≤ j ≤ l + 1 gives a 
factor-containing sequence. Notice that the sequence S corresponding to R is complete. 
While it is not true that this process allows one to convert between counting 
sequences {S_i}_{i≥0} and factor-free counting sequences {R_i}_{i≥0}, it is very easy to 
show that there is a one-to-one correspondence between the cycles of factor-free 
counting sequences and the cycles of complete counting sequences. Since there is 
also a one-to-one correspondence between complete cycles and the truncations 
appearing in Section 3, Theorems 6 and 8 give our final result. 
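
The correspondence on cycles can be illustrated with a short sketch. The function names are hypothetical; the code assumes a complete sequence is stored as a flat list m_1,1,m_2,2,..., and that on a cycle a factor-free element has one entry for each integer from 0 up to its length minus one.

```python
from collections import Counter

def to_factor_free(s):
    """Complete sequence m_1,1,m_2,2,...,m_n,n  ->  R with n_j = m_(j+1) - 1."""
    multipliers, factors = s[0::2], s[1::2]
    if factors != list(range(1, len(factors) + 1)):
        raise ValueError("sequence is not complete")
    return [m - 1 for m in multipliers]

def from_factor_free(r):
    """R = n_0,...,n_l  ->  complete sequence with f_j = j and m_j = n_(j-1) + 1."""
    out = []
    for j, n in enumerate(r, start=1):
        out.extend([n + 1, j])
    return out

def describe(s):
    """Ordinary description: (count, value) for each distinct value, increasing."""
    counts = Counter(s)
    out = []
    for f in sorted(counts):
        out.extend([counts[f], f])
    return out

def describe_factor_free(r):
    """Factor-free description of a cycle element: counts of 0, 1, ..., len(r) - 1."""
    return [r.count(j) for j in range(len(r))]

# A complete element of the three-cycle of Theorem 8 (extra factors 4, 6, 7).
s = [5, 1, 2, 2, 2, 3, 1, 4, 2, 5, 1, 6, 1, 7]
r = to_factor_free(s)
print(r)
print(to_factor_free(describe(s)) == describe_factor_free(r))
```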


Corollary 18. Other than 1, the cycles of factor-free counting sequences all contain 
zeros, and have length one, two, or three. The one-cycles are 2,0,2,0 and 1,2,1,0 
and 2,1,2,0,0 and m + 3,2,1,(m 0’s),1,0,0,0 for m ≥ 0. The two-cycles are 


3,1,1,1,0,0            m + 3,1,0,1,(m − 2 0’s),1,0,0,0 
2,3,0,1,0,0    and     m + 2,3,0,(m 0’s),1,0,0, 


for m ≥ 2. Finally, the only other cycle of any length is the three-cycle 

4,1,1,0,1,0,0 
3,3,0,0,1,0,0 
4,1,0,2,0,0,0. 
Characterizing Continuity 


Daniel J. Velleman 


It is well known that for any function f: R → R, f is continuous in the ε-δ sense if 
and only if the inverse image under f of every open set is open. Recently, when I 
was teaching a topology class, a student¹ asked if continuity could be characterized 
using images instead of inverse images. Of course, we cannot simply replace 
“inverse image” with “image” in the preceding characterization, but perhaps the 
characterization would work if we used some family of sets other than the family of 
open sets. Thus, we are led to consider the following question: Does there exist a 
family ℱ of sets of reals such that for every function f: R → R, f is continuous if 
and only if for every X ∈ ℱ, f(X) ∈ ℱ? 

My initial reaction was that the answer was probably “no,” but I found it 
surprisingly difficult to prove the “no” answer. Perhaps the reason that the “no” 
answer is so difficult to prove is that the answer is almost “yes.” More precisely, we 
have the following characterizations of continuity: 


Theorem 1. There is a family ℱ of sequences of real numbers such that for every 
function f: R → R, f is continuous if and only if for every sequence {x_n}_{n=1}^∞ ∈ ℱ, 
{f(x_n)}_{n=1}^∞ ∈ ℱ. 


Theorem 2. There are families ℱ and 𝒢 of sets of reals such that for every function f: 
R → R, f is continuous if and only if for every X ∈ ℱ, f(X) ∈ ℱ and for every X ∈ 𝒢, 
f(X) ∈ 𝒢. 


Proof of Theorem 1: Let ℱ be the set of all convergent sequences. If f is 
continuous and {x_n}_{n=1}^∞ converges to x, then {f(x_n)}_{n=1}^∞ converges to f(x). For the 
converse, suppose f is discontinuous at x. Then there is some sequence {y_n}_{n=1}^∞ 
converging to x such that {f(y_n)}_{n=1}^∞ does not converge to f(x). Define a sequence 
{x_n}_{n=1}^∞ by letting x_{2n} = y_n and x_{2n−1} = x. Then {x_n}_{n=1}^∞ converges to x, but 
{f(x_n)}_{n=1}^∞ does not converge, since its odd-indexed terms all equal f(x) while its 
even-indexed terms do not converge to f(x). ∎ 


Proof of Theorem 2: Let ℱ be the set of all connected sets (i.e., the set of all open, 
closed, and half-open intervals), and let 𝒢 be the set of all compact sets. Clearly 
both ℱ and 𝒢 are closed under continuous images. Now suppose f is not 
continuous, and the image under f of every connected set is connected. Choose 
numbers x and ε > 0 such that for every δ > 0 there is some y such that 
|x − y| < δ but |f(x) − f(y)| ≥ ε. For each positive integer n choose a number y_n 
such that |x − y_n| < 1/n but |f(x) − f(y_n)| ≥ ε. Then either f(y_n) ≥ f(x) + ε or 
f(y_n) ≤ f(x) − ε. In the first case, since the image under f of the interval from x 
to y_n is connected, there must be some x_n between x and y_n such that f(x_n) = 
f(x) + ε(1/2 + 1/(n + 1)). Similarly, if f(y_n) ≤ f(x) − ε then we can choose x_n 
between x and y_n such that f(x_n) = f(x) − ε(1/2 + 1/(n + 1)). Now let X = 
{x_n | n ∈ Z^+} ∪ {x}. Then since {x_n}_{n=1}^∞ converges to x, the set X is closed and 
bounded, and therefore compact. But either f(x) + ε/2 or f(x) − ε/2 is a limit 
point of f(X) that is not an element of f(X), so f(X) is not closed and therefore 
not compact. ∎ 


¹Actually, the “student” was Amherst College Philosophy professor Alexander George, who was 
auditing the class. I would like to thank him for suggesting this stimulating problem. 


Theorem 1 can be generalized to any metric space (see [3, Theorem 9.3.1]) or 
even any first countable Hausdorff space. Theorem 2 seems harder to generalize. 
The proof we have given generalizes to R^n, but not, for example, to the set Q of 
rational numbers with the topology induced by the usual metric. To see why, 
define f: Q → Q by letting f(x) = 0 if x < 0 and f(x) = 1 if x ≥ 0. Since the only 
nonempty connected sets in Q are singletons, the image under f of every con- 
nected set is connected, and the image of every set is finite and therefore compact. 
However, f is not continuous. 

Despite Theorems 1 and 2, it turns out that the answer to our original question 
is indeed “no.” In fact, we have the following slightly stronger theorem: 


Theorem 3. There do not exist families of sets of reals ℱ and 𝒢 such that for every 
function f: R → R, f is continuous if and only if for every X ∈ ℱ, f(X) ∈ 𝒢. 


Proof: Suppose there were families ℱ and 𝒢 as in the theorem. We will draw 
several conclusions about ℱ and 𝒢 that will eventually lead to a contradiction. 
Note first that if ∅ ∈ ℱ then since f(∅) = ∅ for every function f, ∅ ∈ 𝒢. It 
follows that if f is discontinuous then there must be a nonempty set X ∈ ℱ such 
that f(X) ∉ 𝒢. In particular, ℱ must contain at least one nonempty set. 


Claim 1. The family 𝒢 contains all singletons. 


Proof: Let X be any nonempty element of ℱ. Then all continuous images of X 
must be in 𝒢. Since all constant functions are continuous, this means that 𝒢 
contains all singletons. 


Claim 2. The family 𝒢 contains no two-element sets. 


Proof: Consider any two distinct real numbers, a and b. Let f: R → R be defined 
as follows: 


f(x) = a  if x < 0, 
f(x) = b  if x ≥ 0. 


Then f is discontinuous, so there must be some nonempty set X ∈ ℱ such that 
f(X) ∉ 𝒢. But by Claim 1 f(X) can’t be a singleton, so the only other possibility is 
that f(X) = {a, b} and therefore {a, b} ∉ 𝒢. 


Note that the set X in the proof of Claim 2 must have more than one element, 
since f(X) = {a, b}. Since all continuous images of X are in 𝒢, it follows that 𝒢 
must contain some sets with more than one element. 


Claim 3. If X ∈ ℱ, a, b ∈ X, and a ≤ c < d ≤ b, then X ∩ (c, d) ≠ ∅. 

Proof: Define a function f: R → R as follows: 


f(x) = 0                  if x ≤ c, 
f(x) = (x − c)/(d − c)    if c < x < d, 
f(x) = 1                  if x ≥ d. 


Then f is continuous, so f(X) ∈ 𝒢. But if X ∩ (c, d) = ∅ then f(X) = {0, 1}, 
and by Claim 2 this cannot be an element of 𝒢. Therefore X ∩ (c, d) ≠ ∅. 


In the rest of the proof we will use sets related to the Cantor set. Recall that to 
construct the Cantor set we begin with the closed interval [0,1] and remove the 
open middle third (1/3, 2/3). Then we remove the open middle thirds from the 
two remaining intervals, and then remove the open middle thirds of the remaining 
intervals, and so on. The Cantor set is what remains after infinitely many such 
steps. Alternatively, the Cantor set can be described as the set of all numbers in 
the interval [0,1] that can be written in base 3 using only the digits 0 and 2. 


Claim 4. If Y ∈ 𝒢 and Y has more than one element then Y is uncountable. 


Proof: Suppose Y ∈ 𝒢, Y has more than one element, but Y is countable. 
Enumerating the elements of Y, we can write Y = {y_1, y_2, y_3, ...}, where the y_i’s 
need not all be distinct, but they are not all identical. 

We now carry out a construction similar to the construction of the Cantor set, 
except that at each stage we remove closed middle thirds. We begin with the open 
interval (0, 1) and remove the closed middle third A_1 = [1/3, 2/3], leaving the set 
B_1 = (0, 1/3) ∪ (2/3, 1). Then we remove the closed middle third set A_2 = 
[1/9, 2/9] ∪ [7/9, 8/9] from B_1, leaving B_2 = (0, 1/9) ∪ (2/9, 1/3) ∪ (2/3, 7/9) 
∪ (8/9, 1). In general, for every n, B_n will be a union of 2^n disjoint open intervals, 
each having length 1/3^n. We let A_{n+1} be the union of the closed middle thirds of 
all of these intervals, and we let B_{n+1} = B_n \ A_{n+1}. 
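
The first few stages of this construction can be generated with exact rational arithmetic. The sketch below is only an illustration: the function name is hypothetical, and each interval is stored as a pair of endpoints, to be read as a closed interval when it belongs to some A_n and as an open interval when it belongs to some B_n.

```python
from fractions import Fraction

def closed_middle_third_stages(n_stages):
    """Return [(A_1, B_1), ..., (A_n, B_n)] as lists of (left, right) endpoints."""
    current = [(Fraction(0), Fraction(1))]      # the open interval (0, 1)
    stages = []
    for _ in range(n_stages):
        removed, remaining = [], []
        for left, right in current:
            third = (right - left) / 3
            removed.append((left + third, right - third))   # closed middle third
            remaining.append((left, left + third))          # open left piece
            remaining.append((right - third, right))        # open right piece
        stages.append((removed, remaining))
        current = remaining
    return stages

for n, (a_n, b_n) in enumerate(closed_middle_third_stages(3), start=1):
    print(n, len(a_n), len(b_n), b_n[0][1] - b_n[0][0])
```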

Let {a_n}_{n=1}^∞ be a sequence of positive integers such that every positive integer 
occurs infinitely many times in the sequence. For example, we could use the 
sequence 1,1,2,1,2,3,1,2,3,4,... . Now define a function f: R → R as follows: If 
x ∈ A_n for some n, then let f(x) = y_{a_n}. Otherwise let f(x) = y_1. Clearly f is 
discontinuous, since the range of f is Y, which is a disconnected set. Thus, there 
must be some nonempty set X ∈ ℱ such that f(X) ∉ 𝒢. We will show that 
f(X) = Y, which will contradict the fact that Y ∈ 𝒢. 
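
The example sequence 1,1,2,1,2,3,1,2,3,4,... is produced by concatenating the blocks 1; 1,2; 1,2,3; and so on, so that every positive integer appears in all but finitely many blocks. A minimal sketch (the generator name is hypothetical):

```python
from itertools import count, islice

def block_sequence():
    """Yield 1, 1,2, 1,2,3, ...: every positive integer occurs infinitely often."""
    for block in count(1):
        yield from range(1, block + 1)

first_terms = list(islice(block_sequence(), 10))
print(first_terms)  # [1, 1, 2, 1, 2, 3, 1, 2, 3, 4]
```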

By Claim 1, f(X) is not a singleton, so we can choose x_1, x_2 ∈ X such that 
f(x_1) ≠ f(x_2). Thus f(x_1) and f(x_2) are not both equal to y_1, so we may assume 
without loss of generality that f(x_2) ≠ y_1 and therefore x_2 ∈ A_m for some m. By 
definition, A_m is a union of disjoint closed intervals, and x_2 is in one of these 
intervals. Let [c, d] be the component of A_m containing x_2. Since f(x_1) ≠ f(x_2), 
x_1 ∉ [c, d], so either x_1 < c or x_1 > d. The two cases are very similar, so we 
consider only the case x_1 < c. 

Choose n ≥ m large enough so that x_1 < c − 1/3^n. Then B_n is made up of 2^n 
open intervals, one of which is (c − 1/3^n, c), and x_1 < c − 1/3^n < c ≤ x_2. It is 
clear from the construction that for every t > n, at least one of the intervals 
making up A_t is contained in the interval (c − 1/3^n, c). Therefore, by Claim 3, X 
contains an element of A_t, so f(X) contains y_{a_t}. Since this is true for every t > n, 
and every positive integer occurs infinitely many times in the sequence {a_n}_{n=1}^∞, it 
follows that f(X) = Y. This completes the proof of Claim 4. 
The last step of the proof is motivated by the fact that there is a continuous 
function, the Cantor-Lebesgue function, that takes on a different constant value on 
each of the countably many intervals making up the complement of the Cantor set; 
see [2, pp. 37-38]. It follows that any element of ℱ that is not completely 
contained in one of these intervals must contain at least one element of the Cantor 
set, since otherwise its image would be an element of 𝒢 that is countable but not a 
singleton, contradicting Claim 4. We will apply similar reasoning to a family of sets 
related to the Cantor set. 

To define the sets we will be interested in it will be convenient to use base-3 
notation. We will say that a function mapping the positive integers to the set 
{0, 1, 2} is a digit sequence. If m is an integer and f is a digit sequence, let 
x_{m,f} = m + \sum_{i=1}^{\infty} f(i)/3^i. Note that if f and g are digit sequences, f(i) = g(i) for 
all i < j, and f(j) < g(j), then x_{m,f} ≤ x_{m,g}, with equality holding if and only if 
f(i) = 2 for all i > j, g(i) = 0 for all i > j, and g(j) = f(j) + 1. 

Suppose A is a set of positive integers that is both infinite and co-infinite (i.e., 
has infinite complement). Motivated by the fact that the Cantor set contains those 
numbers that can be written in base 3 using only the digits 0 and 2, we define a 
digit sequence f to be Cantor-like on A beyond N if for every n > N, f(n) = 1 if 
and only if n ∉ A. We will say that f is eventually Cantor-like on A if it is 
Cantor-like on A beyond N for some positive integer N. Let 


X_A = {x_{m,f} | m is an integer and f is a digit sequence that 
is eventually Cantor-like on A}. 


Note that, although some numbers have more than one base-3 expansion, the fact 
that A is co-infinite guarantees that every element of X_A has infinitely many 1’s in 
its base-3 expansion, and therefore that this expansion is unique. In other words, if 
x ∈ X_A then there is a unique integer m and a unique digit sequence f such that 
x = x_{m,f}. Note also that X_A is dense in R, and if B is another infinite, co-infinite 
set of positive integers such that A \ B is also infinite, then X_A and X_B are 
disjoint. 


Claim 5. For every infinite, co-infinite set of positive integers A and every X ∈ ℱ, 
if X has more than one element then X ∩ X_A ≠ ∅. 


Proof: Suppose that A is an infinite, co-infinite set of positive integers and X ∈ ℱ, 
and suppose also that c, d ∈ X and c < d. Choose an integer m, a positive integer 
N, and a function g: {1, 2, ..., N} → {0, 1, 2} such that 

c < m + \sum_{i=1}^{N} g(i)/3^i < m + \sum_{i=1}^{N} g(i)/3^i + 1/3^N < d, 

and note that for every digit sequence f extending g, c < x_{m,f} < d. We will say 
that a digit sequence f is good if f extends g and f is Cantor-like on A beyond 
N. Let 


W = {x_{m,f} | f is a good digit sequence}. 

Then W ⊆ X_A and W ⊆ (c, d). We will show that X ∩ W ≠ ∅. 

Let A \ {1, 2, ..., N} = {a_1, a_2, a_3, ...}, with a_1 < a_2 < a_3 < ... . Imitating the 
definition of the Cantor-Lebesgue function, we now define a function F: W → [0, 1] 
as follows: If x ∈ W then x = x_{m,f} for some unique good digit sequence f. Note 
that f is determined by its values on the a_i’s, and for every i, f(a_i) is either 0 or 2. 
Let F(x) = \sum_{i=1}^{\infty} (f(a_i)/2)/2^i. Then F is a nondecreasing function whose range is 
the entire interval [0, 1]. 
Now define F̄: R → [0, 1] by the formula 


F̄(x) = sup({0} ∪ {F(y) | y ∈ W, y ≤ x}). 

Then F̄ extends F and F̄ is a nondecreasing function mapping R onto [0, 1], so it is 
continuous. We claim that for every x ∉ W, F̄(x) is rational. To see why, consider 
any x ∉ W. Since F maps W onto [0, 1], we can choose y ∈ W such that 
F(y) = F̄(x). By the definition of W, y = x_{m,f} for some good digit sequence f. 
Since y ∈ W and x ∉ W, y ≠ x, so either y < x or y > x. Suppose that y < x, and 
choose n large enough that y + 2/3^{a_n} < x. If there is some i > n such that 
f(a_i) = 0, then let g be the same as f except that g(a_i) = 2, and let z = x_{m,g}. 
Then z = y + 2/3^{a_i} < y + 2/3^{a_n} < x, z ∈ W, and F(z) = F(y) + 1/2^i > F(y) = 
F̄(x), contradicting the definition of F̄(x). Therefore for every i > n, f(a_i) = 2. 
But then 

F̄(x) = \sum_{i=1}^{\infty} (f(a_i)/2)/2^i = \sum_{i=1}^{n} (f(a_i)/2)/2^i + \sum_{i=n+1}^{\infty} 1/2^i = \sum_{i=1}^{n} (f(a_i)/2)/2^i + 1/2^n, 

which is rational, as required. A similar argument shows that F̄(x) is rational if 
y > x. 

Since F̄ is continuous and X ∈ ℱ, it follows that F̄(X) ∈ 𝒢. Recall that we 
have c, d ∈ X and W ⊆ (c, d), so F̄(c) = 0 and F̄(d) = 1. Therefore 0, 1 ∈ F̄(X), 
so F̄(X) is uncountable by Claim 4, and therefore F̄(x) is irrational for some 
x ∈ X. But as we have already observed, F̄ maps every number not in W to a 
rational number, so x ∈ W. Thus x ∈ X ∩ W, so X ∩ W ≠ ∅. This completes the 
proof of Claim 5. 


Fix a bijection h from the rationals to the positive integers. For every real 
number r, let A_r = {h(q) | q is a rational number and q < r}, an infinite, co-infinite 
set of positive integers. If r < s then, since there are infinitely many rational 
numbers between r and s, A_s \ A_r has infinitely many elements. It follows that 
X_{A_r} and X_{A_s} are disjoint. 

Choose a set Y ∈ 𝒢 with more than one element, and let y_0 be any element of 
Y. Define a function f: R → R as follows: If x ∈ X_{A_r} for some r ∈ Y, then let 
f(x) = r. Otherwise, let f(x) = y_0. Clearly f is discontinuous, since for every r ∈ Y 
it takes on the value r on a dense set. However, we will show that for every 
nonempty X ∈ ℱ, f(X) ∈ 𝒢. This will contradict the choice of ℱ and 𝒢 and will 
therefore complete the proof of Theorem 3. To prove our last assertion, suppose 
X ∈ ℱ and X is nonempty. If X is a singleton then so is f(X), so f(X) ∈ 𝒢 by 
Claim 1. If not then by Claim 5, for every r ∈ Y, X ∩ X_{A_r} ≠ ∅. It follows that 
f(X) = Y ∈ 𝒢. ∎ 


For generalizations of Theorem 3, see [1]. 
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Probabilistic Pursuits on the Grid 


A. M. Bruckstein, C. L. Mallows, and I. A. Wagner 


1. INTRODUCTION: PROBABILISTIC PURSUIT. The paths of a sequence of a(ge)nts engaged in a sequence of continuous pursuits converge to the straight line between the origin and destination [2]. We consider a discrete setting where the a(ge)nts are only allowed to visit grid points and chase each other according to a probabilistic rule of motion, and prove a similar result: the average paths of ants in a chain of probabilistic pursuit converge rapidly to a straight line. This discrete model of pursuit also leads to interesting results in the context of linear and cyclic pursuits.
Assume that a sequence of ants $A_0, A_1, A_2, \ldots$ is released from the origin at times $t = 0, \Delta, 2\Delta, \ldots$ ($\Delta$ being an integer > 1), and each ant moves on the integer grid in the plane so that $A_{n+1}$ chases or pursues $A_n$ according to a probabilistic rule defined in the sequel. For the sake of simplicity, consider that each ant measures time from its moment of release: if $A_{n+1}$ is at time $t$ of its motion (i.e., on the $t$th point of its trajectory), then $A_n$ is at time $t + \Delta$. A pursuing ant $A_{n+1}$ stays one unit of time at a grid point $A_{n+1}(t) = (x_{n+1}(t), y_{n+1}(t))$. Then it looks around, and decides where to move next according to the location $A_n(t + \Delta) = (x_n(t + \Delta), y_n(t + \Delta))$ of the pursued ant. Ant locations on the grid will be encoded as complex numbers: $A_n(t) = x_n(t) + j y_n(t)$, where $j = \sqrt{-1}$.

Probabilistic pursuit is defined by the following rule. $A_{n+1}$ chooses its next position as one of its four nearest neighbor-points on the grid, under a probability distribution determined by its relative position with respect to the pursued ant. Thus

$$A_{n+1}(t + 1) = A_{n+1}(t) + \delta_{n+1}(t + 1), \tag{1}$$

where $\delta_{n+1}(\cdot)$ are random variables taking values in $\{1, -1, j, -j\}$ according to

$$\mathrm{Prob}\{\delta_{n+1}(t + 1) = \mathrm{sign}(d_x)\} = \frac{|d_x|}{d}, \qquad \mathrm{Prob}\{\delta_{n+1}(t + 1) = j \cdot \mathrm{sign}(d_y)\} = \frac{|d_y|}{d}, \tag{2}$$

where $d_x, d_y$ are defined as

$$d_x = x_n(t + \Delta) - x_{n+1}(t), \qquad d_y = y_n(t + \Delta) - y_{n+1}(t),$$

and $d = |d_x| + |d_y|$ is the "Manhattan distance" (the Manhattan norm of $x + jy$ is defined as $\|x + jy\| = |x| + |y|$) between successive ants (see Figure 1). If $d$ drops to zero at some time during $A_{n+1}$'s pursuit of $A_n$, the ants merge and continue $A_n$'s pursuit of $A_{n-1}$. The preceding equations define a probabilistic pursuit in the complex plane, with pursuit steps biased according to the relative locations of the pursuer and pursued. The rule is trivial if $\Delta = 1$, since then the pursuing ant follows the leader exactly.
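The rule in (1) and (2) is easy to simulate. The following Python sketch is illustrative only: the names `pursue`, `chain`, and `walk`, the use of coordinate tuples instead of complex numbers, and the handling of a merge (the pursuer simply copies the target's next step) are assumptions made for this example, not details taken from the text.

```python
import random

def pursue(leader, delay, rng):
    """Sample one pursuer path under rule (1)-(2).

    leader: list of (x, y) grid points visited by the pursued ant.
    delay:  the lag Delta between the two ants, in time steps.
    """
    last = len(leader) - 1
    pos = leader[0]
    path = [pos]
    t = 0
    while True:
        # At its own time t the pursuer looks at the target at time t + Delta;
        # once the target has stopped, it keeps looking at the destination.
        tx, ty = leader[min(t + delay, last)]
        dx, dy = tx - pos[0], ty - pos[1]
        d = abs(dx) + abs(dy)                      # Manhattan distance
        if d == 0:
            if t + delay >= last:
                break                              # the stopped target has been reached
            pos = leader[t + delay + 1]            # merged: copy the target's next step (assumption)
        elif rng.random() * d < abs(dx):
            pos = (pos[0] + (1 if dx > 0 else -1), pos[1])   # probability |dx|/d
        else:
            pos = (pos[0], pos[1] + (1 if dy > 0 else -1))   # probability |dy|/d
        path.append(pos)
        t += 1
    return path

def chain(first_path, delay, n_ants, seed=0):
    """Paths of n_ants ants, each pursuing its predecessor."""
    rng = random.Random(seed)
    paths = [first_path]
    for _ in range(n_ants - 1):
        paths.append(pursue(paths[-1], delay, rng))
    return paths

def walk(moves):
    """Turn a string of L/R/U/D moves into a list of grid points starting at (0, 0)."""
    step = {"L": (-1, 0), "R": (1, 0), "U": (0, 1), "D": (0, -1)}
    pts = [(0, 0)]
    for c in moves:
        pts.append((pts[-1][0] + step[c][0], pts[-1][1] + step[c][1]))
    return pts

# A first ant that detours to the left before reaching (4, 4) in 14 steps.
paths = chain(walk("LLLUUUURRRRRRR"), delay=3, n_ants=40)
lengths = [len(p) - 1 for p in paths]
```

Here `lengths` starts at 14 and never increases along the chain, because a pursuer's distance to its target cannot grow under this rule; how quickly it reaches the minimum of 8 depends on the random draws.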
[Figure 1: the pursuer $A_{n+1}$ steps toward $A_n$ with Pr(up) = dy/d and Pr(right) = dx/d, where d = dx + dy.]

Figure 1. The probabilistic model for ant pursuits on $\mathbb{Z}^2$.


Figures 2-4 display simulation results of probabilistic pursuits for various initial trajectories. In each of these simulations we ran many pursuits with identical trajectories for $A_0$, starting at (0, 0) and ending at some grid point (a, b). The figures show the distribution of locations visited by certain ants, the grey level of each pixel being proportional to the number of times the ant visited that location. The ensemble-averaged path of the sample ants is depicted as a bold curve.


[Figure 2: visit distributions for four sample ants of the chain. Probabilistic chain pursuit of 100 ants from (0, 0) to (20, 20). Gray level: distribution of sites visited by sample ants. Bold lines: the average path in 200 simulation runs. Initial Manhattan distance = 5.]

Figure 2. Probability distribution with a simple 'maze' initial path.
[Figure 3: visit distributions for four sample ants of the chain. Probabilistic chain pursuit of 240 ants from (0, 0) to (20, 20). Gray level: distribution of sites visited by sample ants. Bold lines: the average path in 100 simulation runs. Initial Manhattan distance = 5.]

Figure 3. Probability distribution with yet another 'maze' initial path.


[Figure 4: visit distributions for sample ants of the chain. Probabilistic chain pursuit of 100 ants from (0, 0) to (20, 20). Gray level: distribution of sites visited by sample ants. Bold lines: the average path in 200 simulation runs. Initial Manhattan distance = 5.]

Figure 4. Probability distribution with a self-crossing initial path.


2. PATH CONVERGENCE TO STRAIGHT LINES. Assume that the first ant $A_0$ travels along an arbitrary grid path from the origin to $a + jb$, where it stops (without loss of generality we assume that a > 0, b > 0). Then, for each $n \ge 0$, $A_{n+1}$ pursues $A_n$ following the probabilistic pursuit rule given by (1) and (2). Let us define $L_n$ as the (rectilinear) length of the path of $A_n$:

$$L_n = \sum_{t=0}^{T_n} \|A_n(t + 1) - A_n(t)\|,$$

which equals $T_n$, the total number of steps in the path of the $n$th ant.

We shall show that the pursuit paths converge, in a sense, to the "straightest" line on the grid connecting the source 0 to the destination $a + jb$. This will be done in three stages: first we show that for any initial grid path taken by $A_0$ the pursuit trajectories eventually become confined to the rectangle defined by 0 and $a + jb$, and are monotonic (of length $a + b$). Then we show that within the rectangle all monotonic paths have, in the limit, equal probability. This means that the points near the straight diagonal are more likely to be visited, and that the straight diagonal from 0 to $a + jb$ is the average path in the limit. Then we show that the average path converges to the straight line very fast.


2.1. The Pursuit Paths become Monotonic. We first show that the trajectory $A_n(t)$ eventually becomes monotonic. A discrete path is monotonic if it has no "backtracking", that is, $\delta(t) \in \{1, j\}$ for all $t$ during the pursuit.


Lemma 1. $L_n$, the Manhattan path-length of the $n$th ant engaged in probabilistic pursuit, is a positive, non-increasing (hence convergent) sequence.


Proof: Since $T_n = L_n$, we show the claimed properties for $T_n$. Ant $A_{n+1}$ starts its journey exactly $\Delta$ units of time after $A_n$ has started. After $T_n$ units of time, $A_n$ stops at the destination and at this point $A_{n+1}$ has made $T_n - \Delta$ steps along its trajectory. According to the probabilistic pursuit rules, the distance between ants can never increase, hence when $A_n$ stops, its pursuer $A_{n+1}$ is at a distance $\le \Delta$ away from the destination. In the following $\Delta_1 \le \Delta$ units of time, $A_{n+1}$ decreases its distance from the destination by exactly one per unit of time. Therefore we have

$$L_{n+1} = T_{n+1} = T_n - \Delta + \Delta_1 \le T_n - \Delta + \Delta = T_n = L_n,$$

and since the sequence $L_n$ is also bounded below by $a + b$, it converges. ∎


We next claim that if the path-length of an ant is greater than $a + b$, there is a positive probability that the path-length of the next ant decreases.


Lemma 2.

$$\mathrm{Prob}\{L_{n+1} \le L_n - 2 \mid L_n > a + b\} \ge \left(\frac{\Delta - 1}{\Delta}\right)^{L_0}.$$

Proof: Since an ant starts at 0 and finally arrives at $a + jb$, it is clear that for all $n$ we must have

$$\sum_{t=0}^{T_n} \delta_n(t) = a + jb.$$
From the definition of probabilistic pursuit we see that $\delta_n(t) \in \{\pm 1, \pm j\}$, and if $L_n > a + b$ (as we assume) the path of $A_n$ is necessarily non-monotonic, that is: there exist times $t_1, t_2$ such that $\delta_n(t_1) = -\delta_n(t_2)$. Let us take $(t_1, t_2)$ to be the earliest such interval, so that $t_2$ is the first time (after $t_1$) when $A_n$ makes a "backtracking"; see Figure 5, in which we assume (without loss of generality) that at time $t_1$ the ant $A_n$ moves to the left, then up, and at time $t_2$ to the right. Since we require $t_2$ to be the first "backtracking," $A_n$ moves upwards monotonically between $t_1 + 1$ and $t_2 - 1$, for $h$ steps, where $h = t_2 - t_1 - 2$. Since $A_{n+1}(t)$ is $\Delta$ steps behind $A_n$, at time $t_1$ it must be somewhere on the boundary of the square RVQPTSWR. Now from the figure it is clear under what conditions the distance between the ants decreases during the time interval $(t_1, t_2)$. This happens if either (i) $A_{n+1}(t)$ is located to the left of WT, in which case the distance decreases at time $t_1$, or (ii) it is located to the right of V, in which case the distance decreases at time $t_2$. Also, if $A_{n+1}(t)$ is on SPQ the distance decreases sometime between $t_1$ and $t_2$. The only chance to preserve the distance is when $A_{n+1}$ happens to be located on the arc WRV at time $t_1$; in this case $A_{n+1}$ may first get to PR, and then follow $A_n$ one step to the left of PR and later (after $t_2$) to the right, without ever shortening the distance between them. However, this is not sure to happen. Wherever $A_{n+1}$ starts from, there is the possibility that after it reaches PR it never makes a step to the left between $t_1$ and $t_2$. Let us denote by $I$ the event "$A_{n+1}$, once it has arrived on the line PR, stays there (at least) until time $t_2$". As explained previously,

$$\mathrm{Prob}\{L_{n+1} < L_n\} \ge \mathrm{Prob}\{I\}.$$


To obtain a lower bound on the probability that $I$ occurs, note that the probability that $A_{n+1}$ does not move left at a given time in $(t_1, t_2)$, according to the probabilistic pursuit rule, is the ratio of $d_y$ ($\Delta - 1$ in our case) to $d_x + d_y$ ($\Delta$ in our case). The event of staying on the line PR should repeat $t_2 - t_1$ times (or fewer if $A_{n+1}$ arrives on the line PR later than $t_1$). Hence

$$\mathrm{Prob}\{I\} \ge \left(\frac{\Delta - 1}{\Delta}\right)^{t_2 - t_1}, \tag{3}$$

which is the probability that $A_{n+1}$ stays on the line PR during an interval that is not longer than $(t_1, t_2)$, given that $A_n$ is hopping along the line TW. This effort by $A_{n+1}$ is eventually rewarded at time $t_2$, when $A_n$ turns right and the distance decreases by 2. Clearly,

$$t_2 - t_1 < T_n = L_n \le L_0,$$

and hence the probability that the length of the $(n + 1)$st path is shorter than that of the $n$th path by two (or more) units is bounded below by $((\Delta - 1)/\Delta)^{L_0}$. ∎

[Figure 5: the square RVQPTSWR of possible locations for $A_{n+1}(t_1)$ when the Manhattan distance $\Delta$ is given; $A_n$ steps left at $t_1$, climbs $h$ steps, and steps right at $t_2$.]

Figure 5. An illustration of a non-monotonic ant path.


Note that if the distance between ants $A_{n+1}$ and $A_n$ drops, it drops in quanta of two if $A_n$ is not stationary at $a + jb$. The proof of Lemma 2 also shows that chasing an ant that moves along a non-monotonic path induces a positive probability for a drop in the distance between the ants.

The next theorem shows that the pursuit path eventually becomes monotonic: $L_n$ converges to $a + b$ with probability 1. In general, a sequence of random variables $\{X_n\}$ converges with probability 1 (or almost surely) to a value $X$ (we write $X_n \to X$) if, given $\varepsilon, \delta > 0$, there exists an $n_0(\varepsilon, \delta)$ such that for all $n > n_0$,

$$\mathrm{Prob}\{|X_n - X| < \delta\} > 1 - \varepsilon.$$


Theorem 1. There exist constants $k_1, k_2 > 0$ such that, given $\varepsilon > 0$, if

$$n > n_0(\varepsilon) = k_1 + k_2 \log\left(\frac{1}{\varepsilon}\right)$$

then

$$\mathrm{Prob}\{L_n = a + b\} > 1 - \varepsilon,$$

where $L_n$ is the length of the path of $A_n$ in a probabilistic pursuit from the origin to $a + jb$.


Proof: If $L_n > a + b$ then there must have been at most $s_0 = [L_0 - (a + b)]/2 - 1$ ants in the sequence $A_0, \ldots, A_n$ for which a drop (of 2) in the distance to the pursued ant occurred, since a decrease in the distance between consecutive ants implies a decrease in the path length of the pursuing ant. Hence, there were at least $n - s_0$ ants with no decrease in distance. Lemma 2 ensures that each ant path can be viewed as the outcome of an experiment in which the distance-drop event occurs with a probability of at least $p = ((\Delta - 1)/\Delta)^{L_0}$. A sequence of ants engaged in a probabilistic pursuit is a series of trials, with outcomes that are either a "success", a drop in the inter-ant distance (which has a probability at least $p$), or a "failure", in which the distance does not change. Define $\mathcal{A}$ to be the event "$s_0$ or fewer distance-drops in a chain of $n$ ants".

So

$$\mathrm{Prob}\{L_n > a + b\} = \mathrm{Prob}\{\mathcal{A}\} = \sum_{s=0}^{s_0} \mathrm{Prob}\{s \text{ successes up to } n\} \le \sum_{s=0}^{s_0} \binom{n}{s} (1 - p)^{n - s}.$$

328 PROBABILISTIC PURSUITS ON THE GRID [April 


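Writing the binomial bound with $(1 - p)^n$ factored out,

$$\mathrm{Prob}\{\mathcal{A}\} \le (1 - p)^n \sum_{s=0}^{s_0} \binom{n}{s} (1 - p)^{-s} \le (1 - p)^n \binom{n}{s_0} \sum_{s=0}^{s_0} (1 - p)^{-s} \quad (\text{for } n > 2 s_0)$$

$$\le C_1 q^n n^{C_2}.$$

Here $C_1$, $C_2$, and $q < 1$ are constants, independent of $n$ and $\varepsilon$. Since $\lim_{n \to \infty} C_1 q^n n^{C_2} = 0$, there exist constants $C_3, C_4$ such that

$$\text{for all } n > C_3, \qquad C_1 q^n n^{C_2} < C_4 q^{n/2},$$

and in order to get

$$\mathrm{Prob}\{\mathcal{A}\} < C_4 q^{n/2} < \varepsilon$$

it is sufficient to have

$$n > \frac{2 \log C_4}{\log \frac{1}{q}} + \frac{2}{\log \frac{1}{q}} \log\left(\frac{1}{\varepsilon}\right). \qquad ∎$$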

2.2. The Stationary Path-Distribution is Uniform. The paths followed by successive ants form a Markov chain, with the state-space being all paths from the origin to $a + jb$. Theorem 1 ensures that all paths longer than $m = a + b$ are transient. If we restrict to paths of length exactly $m$, we shall show that the chain is irreducible and aperiodic (and therefore ergodic), with the stationary distribution being uniform. If the initial path is monotone, the rule (2) has the following interpretation, which greatly simplifies some of the proofs we offer:


Suppose we have a supply of black and white balls, and a series of urns $U_0, U_1, U_2, \ldots$, which initially are all empty. At time $t = 1, 2, \ldots, a + b$ an agent $A_0$ places a ball, either white or black, into $U_0$. At each time $\Delta, \Delta + 1, \ldots$, agent $A_1$ takes a ball at random from $U_0$ (which at time $\Delta$ contains $\Delta$ balls) and places it in $U_1$. At each time $2\Delta, 2\Delta + 1, \ldots$, agent $A_2$ takes a ball at random from $U_1$ and places it in $U_2$, and so on. For each urn, the number of balls it contains starts by rising from zero to $\Delta$, stays there a while, and then decreases to zero.


This description is equivalent to that of probabilistic pursuit, if we take a white ball for a right-step and a black ball for an up-step, and identify the position $A_n(t)$ with $w + jv$ where $w$ (respectively, $v$) is the total number of white (respectively, black) balls this agent has seen by time $t$. The number of white (black) balls in urn $U_{n-1}$ corresponds to the $x$ ($y$) position of $A_{n-1}$ relative to $A_n$. If $A_n(t) = w + jv$ and $A_{n-1}(t + \Delta) = w + jv + x + jy$, so that the urn $U_{n-1}$ contains $x$ white and $y$ black balls, then the probability that $A_n$ chooses a white ball (so that $A_n(t + 1) = w + 1 + jv$) is just $x/(x + y)$.

Let $S$ be the set of monotonic paths from the origin to $a + jb$, and let $\mathcal{M}$ be the Markov chain with state-space $S$ and transition probabilities induced by the probabilistic pursuit procedure.

We first show that $\mathcal{M}$ is irreducible.


Lemma 3. For any two paths $s, s' \in S$ there is a sequence of positive-probability transitions that leads from $s$ to $s'$.

Proof: One can interpret a monotonic path from 0 to $a + jb$ as a sequence of $a + b$ characters from the set $\{u, r\}$, where $r$ refers to a "right" move and $u$ to an "up" move. There are exactly $a$ $r$'s and $b$ $u$'s. It is easy to see that if, in the target's path $s$, there is a $u$ at time $t$, followed by an $r$ at time $t + 1$, then there is a positive probability that the pursuer's path $s'$ will be equal to $s$ with the only exception that $s'$ has an $r$ at time $t$ and a $u$ at time $t + 1$. The set $S$ of monotonic paths is closed under such "flip" operations: given a path $s \in S$, any other path in $S$ can be reached from $s$ by a sequence of (positive probability) "flip" transitions. Hence the chain is irreducible. ∎


It is easy to see that $\mathcal{M}$ is aperiodic:

Lemma 4. For any path $s \in S$, $p_{ss} > 0$.

Proof: There is always some positive probability that the pursuer follows the pursued's path exactly. ∎


Now we show

Lemma 5. The uniform distribution over $S$ is stationary.

Proof: The number of different paths from the origin to $a + jb$ is

$$|S| = \binom{a + b}{a}.$$

For the uniform distribution of paths, the position at time $t$ (starting from the origin at $t = 0$) is $x + jy$ (where $x + y = t$) with probability

$$\mathrm{Prob}\{x \mid m, t, a\} = \frac{\binom{t}{x} \binom{m - t}{a - x}}{\binom{m}{a}}. \tag{4}$$

This is the hypergeometric distribution, which governs the number of white balls ($x$) in a random sample of $t$ balls chosen from an urn that contains $a$ white and $b$ black balls. Thus we can generate a random path by choosing balls sequentially at random from an urn that initially has $a$ white and $b$ black balls.

Next consider the case when $t + \Delta \le a + b$. Suppose the path of the pursued ("target") ant, $A_0$, is chosen uniformly from $S$, e.g., by drawing from an urn with $a$ white and $b$ black balls, and moving right on white and up on black. Using the "urn" representation, we can obtain the distribution over all possible paths for the $k$th ant by considering a sequence of urns $U_0, U_1, \ldots, U_k, \ldots$ with the black and white balls being moved downstream according to the following rule:

Start with $U_0$ containing $a$ white and $b$ black balls. At each time unit draw a ball at random from $U_0$ and place it into $U_1$ until $\Delta$ balls are accumulated there. Then also start moving randomly chosen balls from $U_1$ to $U_2$ until $\Delta$ balls are in $U_2$, and so forth.

The distribution of paths for the $k$th ant is given by the distribution of ball-color sequences seen entering the urn $U_k$ in this process. Disregarding the color of balls, by symmetry all $(a + b)!$ sequences of balls are equally probable to appear as inputs to $U_k$. Hence the

$$\frac{(a + b)!}{a!\, b!} = \binom{a + b}{a}$$

possible sequences of black and white balls are also equiprobably seen entering the $k$th urn. ∎
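Lemma 5 can be checked exactly on a small grid. The sketch below is a verification aid written for this purpose, not the authors' computation: the names `monotone_paths` and `transition_row` and the choice $a = b = 2$, $\Delta = 2$ are assumptions of the example. It builds the one-step transition probabilities between monotone paths from rule (2) with exact fractions and sums them by row and by column; column sums equal to 1 mean the chain is doubly stochastic, which is equivalent to the uniform distribution being stationary.

```python
from fractions import Fraction
from itertools import combinations

def monotone_paths(a, b):
    """All monotone paths from (0, 0) to (a, b), as strings over {'r', 'u'}."""
    m = a + b
    return ["".join("r" if i in rights else "u" for i in range(m))
            for rights in combinations(range(m), a)]

def transition_row(leader, delay):
    """Exact distribution of the pursuer's path for a given monotone leader path."""
    m = len(leader)
    dist = {"": Fraction(1)}
    for t in range(m):
        seen = leader[:min(t + delay, m)]            # leader's steps up to time t + Delta
        tx, ty = seen.count("r"), seen.count("u")    # target position
        nxt = {}
        for word, pr in dist.items():
            dx, dy = tx - word.count("r"), ty - word.count("u")
            d = dx + dy                              # Manhattan distance to the target
            if dx:
                nxt[word + "r"] = nxt.get(word + "r", 0) + pr * Fraction(dx, d)
            if dy:
                nxt[word + "u"] = nxt.get(word + "u", 0) + pr * Fraction(dy, d)
        dist = nxt
    return dist

a, b, delay = 2, 2, 2
S = monotone_paths(a, b)
P = {s: transition_row(s, delay) for s in S}
row_sums = [sum(P[s].values()) for s in S]
col_sums = [sum(P[s].get(s2, 0) for s in S) for s2 in S]
```

The diagonal entries `P[s][s]` are the probabilities that the pursuer reproduces the target's path exactly, so their positivity is the statement of Lemma 4.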


The property we have just proved is strongly related to the concept of exchangeability, defined as follows (see [6, pp. 97-105]): A countable sequence of events $V_1, V_2, \ldots$ is exchangeable if for any possible choice $1 \le i_1 < i_2 < \cdots < i_k$ of $k$ subscripts, $\mathrm{Prob}(V_{i_1} \cap V_{i_2} \cap \cdots \cap V_{i_k}) = p_k$ depends only on $k$ but not on the actual subscripts $i_j$. If the event $V_i$ is defined as "a white ball enters the last urn at time $i$", then the probability of having $a$ such events does not depend on the order in which they occur, hence the sequence is exchangeable and all paths are equiprobable.

The preceding result is quite general. In fact, if we take a sequence of urns with $a$ white and $b$ black balls in the first one and move them downstream, choosing balls at random from $U_i$ to be placed into $U_{i+1}$, according to any given schedule ensuring that all balls pass through each urn, then all the possible color sequences of balls entering each urn have the same probability. This shows that for monotone pursuits one can vary the inter-ant intervals arbitrarily, and the paths of the ants engaged in pursuit will be uniformly distributed if the first ant chooses a path at random from (0, 0) to (a, b). This also generalizes to higher dimensions (that is, more colors for balls). Thus the paths generated by this rule are also governed by a uniform stationary distribution.

From Lemmas 3, 4, and 5 we have

Theorem 2. $\mathcal{M}$ is an ergodic Markov chain and its unique stationary distribution is uniform.

Two immediate corollaries of Theorem 2 are:

Corollary 1. Assuming stationarity, the average path is the straight line from 0 to $a + jb$.

Proof: A standard result for the hypergeometric distribution (4) is that $E[x \mid m, t, a] = ta/m$. ∎


Corollary 2. Assuming stationarity, ants are usually very near the average path.

Proof: For the hypergeometric distribution (4), the variance of $x$ is

$$V[x \mid m, t, a] = \frac{t(m - t)ab}{(m - 1)m^2}.$$

Thus if $a = \alpha m$, $b = \beta m$, and $t = \tau m$ (where $\alpha + \beta = 1$) we have

$$V[x(t)] = m\alpha\beta\tau(1 - \tau) + O(1).$$

Suppose $m$ is large. We can bound the probability that at time $t$ the ant is outside a region of width $m^{\varepsilon}$ around the average, $\varepsilon$ being a number in $(\tfrac{1}{2}, 1)$. Using Chebyshev's inequality,¹

$$\mathrm{Prob}\left\{\left|x(t) - \frac{at}{m}\right| \ge m^{\varepsilon}\right\} = \mathrm{Prob}\left\{\left(x(t) - \frac{at}{m}\right)^2 \ge m^{2\varepsilon}\right\} \le \frac{V[x(t)]}{m^{2\varepsilon}} = \alpha\beta\tau(1 - \tau)\, m^{1 - 2\varepsilon} + O(m^{-2\varepsilon}) \xrightarrow[m \to \infty]{} 0.$$

The normalized width of the strip with positive probability is $m^{\varepsilon}/m$, which clearly converges to zero when $m \to \infty$. See Figure 6 for the line width in the stationary distributions for various values of $m$. ∎

¹Chebyshev's inequality ([5, p. 376]) says: let $X$ be a random variable with expected value $E[X]$ and variance $V[X]$. Then $\mathrm{Prob}\{(X - E[X])^2 \ge a\} \le V[X]/a$ for any $a > 0$.

[Figure 6: stationary distribution of paths from (0, 0) to (2n, n). Gray level: distribution of sites visited by sample ants. Width defines the strip where 80% of the probability is concentrated.]

Figure 6. Line widths for stationary distribution when a = 2b.
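The mean and variance quoted for the hypergeometric distribution (4) can be confirmed by brute force on a small grid. The following sketch enumerates every monotone path to $(a, b)$ and computes the exact moments of the $x$-coordinate at each time; the function name `moments_of_x` and the choice $a = 4$, $b = 3$ are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def moments_of_x(a, b, t):
    """Exact mean and variance of x(t) over all monotone paths from (0, 0) to (a, b)."""
    m = a + b
    # A path is determined by the set of times at which it steps right;
    # x(t) counts the right-steps among the first t steps.
    xs = [sum(1 for i in rights if i < t) for rights in combinations(range(m), a)]
    n = len(xs)
    mean = Fraction(sum(xs), n)
    var = Fraction(sum(x * x for x in xs), n) - mean ** 2
    return mean, var

a, b = 4, 3
m = a + b
results = {t: moments_of_x(a, b, t) for t in range(m + 1)}
```

For each `t`, `results[t]` can be compared with $ta/m$ and $t(m-t)ab/((m-1)m^2)$; exact rational arithmetic avoids any rounding question.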


3. CONVERGENCE TO THE STRAIGHT LINE IS FAST. We now show that the average of the ant-paths converges to the straight line between source and destination exponentially fast.

In the following, we ignore the initial non-monotonic transient, and assume that the leading ant $A_0$ executes an arbitrary monotonic path. Let us define a new entity $D_n$ (a determin-ant?) which progresses along the average path of $A_n$, i.e., such that at each time $t$, $D_n(t) = E[A_n(t)]$. Then

$$D_{n+1}(t + 1) = D_{n+1}(t) + \frac{D_n(t + \Delta) - D_{n+1}(t)}{\Delta}. \tag{5}$$

To justify this equation, note that the expectation of the step made by $A_{n+1}$ at time $t$ is

$$E[A_{n+1}(t + 1)] - E[A_{n+1}(t)] = \frac{E[A_n(t + \Delta)] - E[A_{n+1}(t)]}{\Delta}.$$

Let us denote the average path of the ant $A_n$ by the complex vector $d = (d(0), d(1), d(2), \ldots, d(m))$, where $m = a + b$, and denote the path of the pursuing ant by $d' = (d'(0), d'(1), d'(2), \ldots, d'(m))$. We measure the distance between these two paths by the maximum distance between any of their components, i.e.,

$$\mathrm{dist}(d, d') = \max_i |d(i) - d'(i)|,$$

where $|\cdot|$ stands for the Euclidean distance. Now we can show that the average path approaches its linear limit exponentially fast.
Theorem 3.

$$\mathrm{dist}(d_n, d_\infty) \le m \left(1 - \alpha^{m - \Delta}\right)^n, \tag{6}$$

where $\alpha = (\Delta - 1)/\Delta$.


Proof: First we show that the limit average path, $d_\infty$, is indeed the straight line. We can write the evolution equations as

$$0 \le t \le m - \Delta: \quad d'(t + 1) - d'(t) = \frac{d(t + \Delta) - d'(t)}{\Delta},$$
$$m - \Delta < t < m: \quad d'(t + 1) - d'(t) = \frac{d(m) - d'(t)}{m - t}, \tag{7}$$

with boundary conditions

$$d(0) = d'(0) = 0, \qquad d(m) = d'(m) = a + jb,$$

where the denominators represent the Manhattan distances between $A_n$ and $A_{n+1}$. This distance is initially $\Delta$, and stays constant until $A_n$ reaches $a + jb$, whereupon the distance decreases by one per unit of time. Hence we can relate the vectors $d$ and $d'$ in the following way:

$$\begin{aligned}
d'(0) &= d(0) \\
\Delta\, d'(1) + (1 - \Delta)\, d'(0) &= d(\Delta) \\
\Delta\, d'(2) + (1 - \Delta)\, d'(1) &= d(\Delta + 1) \\
&\;\;\vdots \\
\Delta\, d'(m - \Delta + 1) + (1 - \Delta)\, d'(m - \Delta) &= d(m) \\
(\Delta - 1)\, d'(m - \Delta + 2) + (2 - \Delta)\, d'(m - \Delta + 1) &= d(m) \\
(\Delta - 2)\, d'(m - \Delta + 3) + (3 - \Delta)\, d'(m - \Delta + 2) &= d(m) \\
&\;\;\vdots \\
2\, d'(m - 1) + (-1)\, d'(m - 2) &= d(m) \\
d'(m) &= d(m).
\end{aligned} \tag{8}$$


A fixed point of this linear iterative process is a vector $d$ such that $d' = d$. In such a vector, $d(t + 1) - d(t)$ must be constant for all $t$. Otherwise, assume that there is a solution for which the sequence $d(t + 1) - d(t)$ is not constant, and denote $x(t) = \Re\, d(t)$; the same argument holds for $y(t) = \Im\, d(t)$. Denote by $t_0$ the smallest integer in $[0, m - 2]$ such that the difference $x(t_0 + 1) - x(t_0)$ is an extremum, either a minimum or a maximum. This difference is necessarily nonnegative since the path is monotonic. From (7) it follows that

$$x(t_0 + 1) - x(t_0) = \frac{x(t_0 + \delta) - x(t_0)}{\delta} = \frac{1}{\delta} \sum_{k=1}^{\delta} \left[x(t_0 + k) - x(t_0 + k - 1)\right].$$

Hence

$$\min_{1 \le k \le \delta} |x(t_0 + k) - x(t_0 + k - 1)| \le |x(t_0 + 1) - x(t_0)| \le \max_{1 \le k \le \delta} |x(t_0 + k) - x(t_0 + k - 1)|,$$

where $\delta = \min\{\Delta, m - t_0\} > 1$. The last inequality is strict since not all the differences are equal. But this contradicts our assumption that $x(t_0 + 1) - x(t_0)$ is an extremum. Moreover, $t_0$ cannot equal $m - 1$, since then both the minimum and maximum would occur at the same index, contradicting the assumption that the sequence is non-constant.

Since $d(0)$ and $d(m)$ are not affected by the iterative process, the vector $d_n$ converges to a limit that is a sequence of points equi-spaced on the straight line from $d(0)$ to $d(m)$.
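The averaged dynamics (7) can be iterated numerically to watch the average path settle onto the diagonal. The sketch below is an illustration under stated assumptions: the function names, the values $a = 5$, $b = 3$, $\Delta = 3$, and the initial path (all right-steps first, then all up-steps) are choices made for the example.

```python
def average_step(d, delay):
    """Apply one step of the averaged dynamics (7): return d' given d."""
    m = len(d) - 1
    new = [d[0]]                                   # d'(0) = d(0)
    for t in range(m):
        target = d[min(t + delay, m)]              # d(t + Delta), or d(m) near the end
        dist = min(delay, m - t)                   # Manhattan distance to the target
        new.append(new[t] + (target - new[t]) / dist)
    return new

def dist_to_line(d):
    """Largest Euclidean distance between d(t) and the equally spaced diagonal points."""
    m = len(d) - 1
    return max(abs(d[t] - d[m] * t / m) for t in range(m + 1))

a, b, delay = 5, 3, 3
m = a + b
# Initial average path: a steps right, then b steps up (an arbitrary monotone choice).
d = [complex(min(t, a), max(t - a, 0)) for t in range(m + 1)]
history = [dist_to_line(d)]
for _ in range(300):
    d = average_step(d, delay)
    history.append(dist_to_line(d))

# The equally spaced diagonal points should be left unchanged by the map.
line = [complex(a, b) * t / m for t in range(m + 1)]
line_after = average_step(line, delay)
```

`history` records the distance to the straight line after each generation; it can be compared with the geometric envelope $m(1 - \alpha^{m - \Delta})^n$, $\alpha = (\Delta - 1)/\Delta$, to see how conservative that rate is for a particular starting path.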

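We next show that the distance from the limit decreases exponentially fast. The set of difference equations (8) can be written as

$$\Phi d' = \Psi d,$$

where $\Phi$ and $\Psi$ are the $(m + 1) \times (m + 1)$ matrices read off from (8). $\Phi$ is lower bidiagonal, with diagonal entries $1, \Delta, \ldots, \Delta, \Delta - 1, \Delta - 2, \ldots, 2, 1$ and subdiagonal entries $1 - \Delta, \ldots, 1 - \Delta, 2 - \Delta, 3 - \Delta, \ldots, -1, 0$. $\Psi$ has a single 1 in each row: in column 0 of row 0, and in column $\min(t + \Delta - 1, m)$ of row $t \ge 1$.

Note that $\Phi$ and $\Psi$ are independent of the specific path. Hence, the dynamics of the averaged ant-paths is described by

$$d' = \Phi^{-1} \cdot \Psi \cdot d = P \cdot d,$$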
i.e., a fixed matrix operator repeatedly acting on the average ant-path vector. Let us now sketch the form of this operator and derive a bound on its second-largest eigenvalue.

With some algebraic manipulations (solving (8) row by row, with $\alpha = (\Delta - 1)/\Delta$), it can be found that the rows of $P$ are given by

$$d'(0) = d(0),$$
$$d'(t) = \alpha^t d(0) + \frac{1}{\Delta} \sum_{k=1}^{t} \alpha^{t - k}\, d(k + \Delta - 1), \qquad 1 \le t \le m - \Delta + 1,$$
$$d'(m - \Delta + 1 + i) = \frac{\Delta - 1 - i}{\Delta - 1}\, d'(m - \Delta + 1) + \frac{i}{\Delta - 1}\, d(m), \qquad 0 \le i \le \Delta - 1.$$

Note that the row sums of $P$ are all 1.

Since the fixed point of the process $d' = P \cdot d$ is the straight line from $d(0)$ to $d(m)$, and is independent of the entries $d(1), d(2), \ldots, d(m - 1)$ in the initial $d$, we know that as $n$ tends to infinity, $P^n$ approaches the form of two non-zero columns on left and right, all other entries being zeroes. In order to analyze the rate of convergence of this process, let us bound the value of $p_{ij}^{(n)}$, the $(i, j)$th entry in $P^n$. An observation we need for this purpose is that the sum of the central $m - 1$ entries in any row of $P$ is bounded from above:

$$\sum_{k=1}^{m-1} p_{ik} \le 1 - \alpha^{m - \Delta},$$

with equality achieved at the $(m - \Delta + 1)$th row of $P$. Using this observation and the fact that the top and bottom entries in the $m - 1$ central columns of $P^n$ are zero for all $n$, we have the following recursive argument: for $1 \le j \le m - 1$,

$$p_{ij}^{(n)} = \sum_{k=1}^{m-1} p_{ik}\, p_{kj}^{(n-1)} \le \left(\sum_{k=1}^{m-1} p_{ik}\right) \max_k \{p_{kj}^{(n-1)}\} \le \left(1 - \alpha^{m - \Delta}\right) \max_k \{p_{kj}^{(n-1)}\} \le \left(1 - \alpha^{m - \Delta}\right)^n. \tag{9}$$

Hence, the magnitudes of all the entries of $P^n$ except for those in the leftmost and rightmost columns tend to zero rather quickly. Now let us consider the 0th and $m$th columns. Due to the special structure of $P$ and the inequalities (9) we have that for all $i$, $0 < i < m$,

$$p_{i0}^{(n)} = p_{i0}^{(n-1)} + \sum_{k=1}^{m-1} p_{ik}^{(n-1)}\, p_{k0} \le p_{i0}^{(n-1)} + (m - 1)\left(1 - \alpha^{m - \Delta}\right)^{n-1},$$

and

$$\left|p_{i0}^{(\infty)} - p_{i0}^{(n)}\right| \le (m - 1) \sum_{k=n}^{\infty} \left(1 - \alpha^{m - \Delta}\right)^k = \frac{m - 1}{\alpha^{m - \Delta}} \left(1 - \alpha^{m - \Delta}\right)^n,$$

i.e., the leftmost entries of $P^n$ approach their limit values exponentially fast, too. A similar argument holds for the entries of the rightmost column. We conclude that the effect of the initial conditions (i.e., of $d(1), d(2), \ldots, d(m - 1)$ in $d_0$) decays exponentially fast, and the average ant path converges to the straight line as expressed by (6). ∎


4. RELATED TOPICS. We now consider several extensions to the probabilistic 
pursuit model. 


4.1. Probabilistic Linear Pursuit. Consider two ants, the first of which, $A_0$, is happily hopping along a straight line parallel to the $y$-axis: $A_0(t) = r + jt$, where $r$ is a constant. A second ant, $A_1$, is chasing $A_0$, and both are traveling at the same speed. Using our probabilistic pursuit model, one can get an equation for the average trajectory of $A_1(t)$, similar to the corresponding deterministic results found in [1, pp. 251-253] and [4, pp. 113-127].


Theorem 4. If $A_0$ is launched from $(r, 0)$ at time 0 and is going upwards at speed 1, and if $A_1$ is launched from $(0, 0)$ at time 0 and is pursuing $A_0$ according to the probabilistic pursuit model, the average behavior of $A_1(t)$ is described by the curve

$$y(x) = \frac{\log\frac{r - x}{r}}{\log\frac{r - 1}{r}} - x.$$
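Proof: The behavior of the ants can be described by the equations

$$A_0(t) = r + jt, \qquad A_1(0) = 0, \qquad A_1(t) = A_1(t - 1) + \delta(t), \tag{10}$$

where $\delta(t)$ is the random variable defined in (2). Since the rectilinear distance between them is always $r$, the average $y$-coordinate of $A_1$ at time $t$ is

$$y_t = y_{t-1} + \frac{t - 1 - y_{t-1}}{r}$$

with initial condition $y_0 = 0$. Substituting $\alpha = (1 - \frac{1}{r})$, $\beta = \frac{1}{r}$, and using the fact that $x_0 = y_0 = 0$, it turns out that

$$y_t = \alpha y_{t-1} + \beta(t - 1) = \alpha(\alpha y_{t-2} + \beta(t - 2)) + \beta(t - 1) = \cdots = \beta \sum_{k=1}^{t-1} \alpha^{k-1}(t - k) = \beta \alpha^{t-1} \sum_{k=0}^{t-1} k\, \alpha^{-k} = r\left(1 - \frac{1}{r}\right)^t + t - r.$$

In the same way the average $x$-coordinate satisfies $x_t = x_{t-1} + (r - x_{t-1})/r$, so that $x_t = r(1 - \alpha^t)$ and $t = \log\frac{r - x}{r} / \log\frac{r - 1}{r}$; hence

$$y(x) = \frac{\log\frac{r - x}{r}}{\log\frac{r - 1}{r}} - x. \qquad ∎$$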

This result is quite similar to the one obtained for continuous linear pursuit [1, p. 251]:

$$y(x) = \frac{(x - r)^2}{4c} - \frac{c}{2} \log(r - x) + c',$$

where $c, c'$ are constants. The difference is explained by the different measures of distance involved: in our model the ant moves toward its target with a constant speed, maintaining a constant Manhattan distance to it, but the length of the average step it takes in the direction of the target varies, while in [1] the pursuit is carried out with constant Euclidean velocity pointed at the chased ant. Note that the Euclidean ant is asymptotically at distance $r/2$ behind its target, while the Manhattan ant never decreases its distance below $r$. See Figure 7 for a graphic comparison of the pursuit paths induced by these two models.
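The recurrence for the average height in probabilistic linear pursuit is simple enough to check numerically. In the sketch below the function names and the values $r = 10$ and 50 steps are illustrative assumptions; the expression used for the average $x$-coordinate, $x_t = r(1 - (1 - 1/r)^t)$, comes from applying the same pursuit rule to the horizontal step.

```python
import math

def average_heights(r, steps):
    """Iterate y_t = y_{t-1} + (t - 1 - y_{t-1}) / r, starting from y_0 = 0."""
    ys = [0.0]
    for t in range(1, steps + 1):
        ys.append(ys[-1] + (t - 1 - ys[-1]) / r)
    return ys

def closed_form(r, t):
    """Closed form r (1 - 1/r)^t + t - r of the recurrence above."""
    return r * (1 - 1 / r) ** t + t - r

def curve(r, x):
    """Average height as a function of the average x-coordinate, for 0 <= x < r."""
    return math.log((r - x) / r) / math.log((r - 1) / r) - x

r, steps = 10, 50
ys = average_heights(r, steps)
xs = [r * (1 - (1 - 1 / r) ** t) for t in range(steps + 1)]
```

Plotting `xs` against `ys` traces the average path of the Manhattan pursuer: it approaches the vertical line $x = r$ while its average height trails the target's height by an amount that tends to $r$.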


4.2. Probabilistic Cyclic Pursuit. Assume that $A = \{A_0, A_1, \ldots, A_n\}$ is a set of ants, chasing each other cyclically, that is: $A_1$ is chasing $A_0$, $A_2$ is chasing $A_1$, etc., and $A_0$ is chasing $A_n$. The set $A$ begins at positions $A(0)$ at time $t = 0$ and then evolves according to the probabilistic pursuit rules defined in the previous section.
[Figure 7: Manhattan vs. Euclidean pursuit. Legend: x = pursued, + = Manhattan pursuer, o = Euclidean pursuer; horizontal axis from 0 to 4.]

Figure 7. Comparison of the Manhattan and Euclidean models of pursuit.


Denote by $C_t$ the Manhattan circumference of the set $A$:

$$C_t = \sum_{i=0}^{n} \|A_{i+1}(t) - A_i(t)\|$$

(with $A_{n+1}$ read as $A_0$), where $\|u - v\|$ denotes the Manhattan distance between points $u$ and $v$. In [2] and [3] it was shown that ants engaged in deterministic cyclic pursuit always converge to a point of mutual encounter (and all captures are almost always simultaneous, see [7]). Here we shall show that the ants reach a limit cycle, each ant being not more than one unit of distance away from its chaser.


Theorem 5. Ants engaged in cyclic probabilistic pursuit with initial distances $d_0, d_1, \ldots, d_n$ converge to a limit cycle with circumference $C_\infty = \sum_{i=0}^{n} (d_i \bmod 2)$. Moreover, this convergence is exponentially fast: for any given $\varepsilon > 0$, if $t > t_0(\varepsilon) = O(\log(\frac{1}{\varepsilon}))$ then $\mathrm{Prob}\{C_t = C_\infty\} > 1 - \varepsilon$.


Proof: Inter-ant distances never increase in probabilistic pursuit, hence $C_t$ is a non-increasing positive, hence convergent, sequence. Arguments similar to those in the proof of Lemma 2 show that whenever the distance between two ants is greater than 1 there is a positive probability, bounded from below, for a decrease (by 2) in this distance, provided the pursued ant's path is non-monotonic. But, in the case of cyclic pursuit, the paths of all ants are obviously non-monotonic, since they all have infinite length and are confined to the "bounding box" of the initial configuration. Hence $C_\infty$ must correspond to a limiting pursuit configuration in which all distances are less than 2, proving the first part of the assertion of the theorem.

To prove that the convergence is exponentially fast, note that, as in the proof of Lemma 2, the inter-ant distance drops by 2 with probability higher than

$$\left(\frac{1}{2}\right)^{\text{length of non-monotonic run}} \ge \left(\frac{1}{2}\right)^{C_0}$$

(since $C_0$ is an obvious upper bound on all such runs) each time a non-monotonic run occurs in the pursued ant's trajectory. But this happens at least once every $C_0$ steps (since the ant must stay within a bounding box of Manhattan perimeter of at most $C_0$). Hence we have

$$\mathrm{Prob}\{C_{t + C_0} \le C_t - 2 \mid C_t > C_\infty\} \ge \left(\frac{1}{2}\right)^{C_0}.$$

In order to get $\mathrm{Prob}\{C_t = C_\infty\} > 1 - \varepsilon$, we must (as in Theorem 1) have $t$ of the order of $\log(1/\varepsilon)$. ∎


The limit cycle may be a polygon with (up to) $n + 1$ vertices, as long as the length of each edge is exactly one unit; see Figure 8 for an example. Such a polygon is stable since in this case each ant $A_{i+1}$ "replaces" the pursued one $A_i$, so the overall shape is preserved. Figures 9-14 exhibit simulation examples of the probabilistic cyclic pursuit. For each of the initial configurations we show the evolution of the probability distribution calculated over a large number of experiments, as well as the actual ant locations in a single experiment. It would be interesting to investigate the relation between the shape of the initial polygon whose vertices are $A_i(0)$, $i = 0, 1, \ldots, n$, and the shape of the limit cycle.
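A small simulation makes the behavior of the inter-ant distances in cyclic pursuit concrete. The sketch below is a minimal illustration, not the simulation behind the figures: it assumes that all ants move simultaneously, that each ant looks at its target's current position (no lag), and that an ant sitting on its target copies the target's move. The function names and the starting configuration are likewise assumptions of the example.

```python
import random

def distances(ants):
    """Manhattan distance from each ant i to the ant it chases (ant i - 1, cyclically)."""
    return [abs(ants[i][0] - ants[i - 1][0]) + abs(ants[i][1] - ants[i - 1][1])
            for i in range(len(ants))]

def cyclic_step(ants, rng):
    """Move every ant once; ant i chases ant i - 1, and ant 0 chases the last ant."""
    n = len(ants)
    moves = [None] * n
    for i in range(n):
        dx = ants[i - 1][0] - ants[i][0]
        dy = ants[i - 1][1] - ants[i][1]
        d = abs(dx) + abs(dy)
        if d == 0:
            continue                               # merged with its target; resolved below
        if rng.random() * d < abs(dx):
            moves[i] = (1 if dx > 0 else -1, 0)    # probability |dx|/d
        else:
            moves[i] = (0, 1 if dy > 0 else -1)    # probability |dy|/d
    if all(mv is None for mv in moves):
        return list(ants)                          # every ant sits on its target: nothing moves
    for i in range(n):
        k = i
        while moves[k] is None:                    # a merged ant copies the ant it sits on
            k -= 1
        moves[i] = moves[k]
    return [(x + mx, y + my) for (x, y), (mx, my) in zip(ants, moves)]

rng = random.Random(1)
ants = [(0, 0), (7, 2), (3, 9), (-4, 5), (-6, -3)]
history = [distances(ants)]
for _ in range(2000):
    ants = cyclic_step(ants, rng)
    history.append(distances(ants))
```

Under these assumptions each inter-ant distance changes by 0 or by 2 at every step, so its parity is preserved and the circumference `sum(history[k])` can never fall below the sum of the initial distances taken mod 2.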


5. CONCLUDING REMARKS. Many of the results of this paper continue to hold when the lag $\Delta$ is not held constant, but is allowed to vary from one ant to the next. We could also allow for the chasing ant to be guided by an ant other than the one immediately ahead. To achieve the asymptotic results, we need only ensure that eventually the current ant is many generations removed from the first one. Also we need to have $\Delta \ge 2$ infinitely often at each stage of the walk.

The results discussed in this paper can be generalized to three (or more)
dimensional space. The probability of $A_{n+1}$ moving along each axis will, in this
case, be proportional to the projection of the vector $A_n - A_{n+1}$ along this axis.


Figure 8. A possible limit cycle for a cyclic pursuit.

Figure 9. Probability distribution in cyclic pursuit: initial configuration 1. (Number of ants = 8; time = 100; number of experiments = 50.)

Figure 10. A single run of cyclic pursuit: initial configuration 1. (Number of ants = 8; time = 100; one experiment out of 50; initial M-distances = [13 14 13 20 47 54 27 40].)

Figure 11. Probability distribution in cyclic pursuit: initial configuration 2. (Number of ants = 8; time = 120; number of experiments = 50.)

Figure 12. A single run of cyclic pursuit: initial configuration 2. (Number of ants = 8; time = 120; one experiment out of 50; initial M-distances = [20 20 20 20 20 20 20 20]; final M-distances = [0 0 0 0 0 0 0 0].)

Figure 13. Probability distribution in cyclic pursuit: initial configuration 3. (Number of ants = 8; time = 120; number of experiments = 50.)

Figure 14. A single run of cyclic pursuit: initial configuration 3. (Number of ants = 8; time = 120; one experiment out of 50; initial M-distances = [39 41 39 39 41 39 38 38]; final M-distances = [1 1 1 1 1 1 0 0].)

Ants obeying the probabilistic pursuit model have the property of moving, on 
the average, in the same direction as a continuous pursuit. However, their speed is 
not constant since it depends on the location of the chaser relative to the target. 
To overcome this problem, for purposes of approximating continuous pursuit, one 
might consider the following Euclidean probabilistic rule of pursuit: 


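$$P_x = \mathrm{Prob}\{\delta_{n+1}(t+1) = \mathrm{sign}(d_x)\} = \frac{1}{2}\,\frac{|d_x|}{\sqrt{d_x^2 + d_y^2}}$$
$$P_y = \mathrm{Prob}\{\delta_{n+1}(t+1) = j\cdot\mathrm{sign}(d_y)\} = \frac{1}{2}\,\frac{|d_y|}{\sqrt{d_x^2 + d_y^2}} \qquad (11)$$
$$P_0 = \mathrm{Prob}\{\delta_{n+1}(t+1) = 0\} = 1 - \frac{1}{2}\,\frac{|d_x| + |d_y|}{\sqrt{d_x^2 + d_y^2}}$$


where $d_x = x_n(t + \Delta) - x_{n+1}(t)$ and $d_y = y_n(t + \Delta) - y_{n+1}(t)$ are defined as
before. The additional “Euclidization” factor does not affect the average direction of
the chaser, but does normalize its velocity to $\frac{1}{2}$, independent of the target’s
location: it is easy to verify that $P_x + P_y + P_0 = 1$ and that $(P_x^2 + P_y^2)^{1/2} = \frac{1}{2}$. It is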
an open question whether some or all of our results hold for this model. The main 
difficulty is caused by the non-zero probability for the chaser to stay at its current 
location, which means that the pursuit distance is not monotonically decreasing, as 
it is in the Manhattan case. 
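
The two identities claimed for the Euclidean rule can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the behaviour at zero displacement (the chaser stays put) is an assumption made only so that the function is defined everywhere.

```python
import math

def euclidean_rule(dx, dy):
    """Return (P_x, P_y, P_0) for the displacement (dx, dy) from chaser to target."""
    r = math.hypot(dx, dy)
    if r == 0:
        # Assumption: with zero displacement the chaser stays where it is.
        return 0.0, 0.0, 1.0
    p_x = 0.5 * abs(dx) / r
    p_y = 0.5 * abs(dy) / r
    return p_x, p_y, 1.0 - p_x - p_y

def mean_step(dx, dy):
    """Expected step of the chaser: a unit move along x with probability P_x
    (in the direction of dx), a unit move along y with probability P_y."""
    p_x, p_y, _ = euclidean_rule(dx, dy)
    return math.copysign(p_x, dx), math.copysign(p_y, dy)
```

For any nonzero displacement the mean step points along (dx, dy) and has length 1/2, which is the normalization the text describes.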
Hipparchus, Plutarch, Schröder, and Hough


Richard P. Stanley 


1. HIPPARCHUS AND PLUTARCH. Plutarch was a Greek biographer and 
philosopher from Chaeronea, who was born before A.D. 50 and died after A.D. 120. 
He is best known for his Parallel Lives, which inspired such Renaissance writers as 
Montaigne, Shakespeare, Dryden, and Rousseau. His many other works have been 
gathered together under the name Moralia, “a collection of comparatively short 
treatises and dialogues which cover an immense range of subjects, literary, ethical, 
political, and scientific” [21, p. 8]. Part of the Moralia consists of the Table-Talk, “a 
collection of dialogues purporting to reproduce the after-dinner conversation of 
Plutarch and his friends and relatives on various occasions” [20, p. 2]. In the 
Table-Talk [20, VIII.9, 732] appears the following statement: 


Chrysippus says that the number of compound propositions that can be made from only ten 
simple propositions exceeds a million. (Hipparchus, to be sure, refuted this by showing that on 
the affirmative side there are 103,049 compound statements, and on the negative side 310,952.) 


Chrysippus (c. 280-207 B.C.) came to Athens around 260 and became a leading
Stoic philosopher. Hipparchus was a Greek astronomer (c. 190-after 127 B.C.)
from Nicaea in Bithynia (now Iznik, Turkey) who spent much of his life at Rhodes. 
He was perhaps the greatest astronomer of antiquity. He is most famous for his 
discovery of the precession of the equinoxes, based on his own observations and 
those of Timocharis 160 years earlier. For further information on the work of 
Hipparchus, see [19, Book I, E], [32]. Hipparchus was an excellent mathematician 
(though for a contrary view see [33, p. 211]); he was the first person to make 
systematic use of trigonometry, and he was probably the inventor of stereographic 
projection. However, for many centuries no one was able to make sense of the 
statement of Plutarch. For instance, T. L. Heath [12, vol. 2, p. 256], a standard 
older authority on Greek mathematics, says of Plutarch’s statement that “it seems 
impossible to make anything of these figures,” while the more recent authority
O. Neugebauer [19, p. 338] states that Plutarch’s statement “[has], however, so far 
eluded a satisfactory explanation.” Similarly W. and M. Kneale [16, p. 162], 
authorities on the history of logic, remark that “It is difficult to make any 
satisfactory sense of the passage.” N. L. Biggs [2, p. 113] notes the paucity of 
combinatorial computations by the ancient Greeks and referring to Plutarch’s 
passage says that “the most interesting of them is also the most mysterious.” 
A number of eminent mathematicians and historians of mathematics, such as 
M. Cantor, J. Tropfke, S. Günther, and E. Artin, have attempted to understand
Plutarch’s statement without success. An attempt to reconstruct Hipparchus’ pro- 
cedure appears in [1], though it will be apparent from our discussion that this 
attempt is incorrect. Another incorrect speculation appears in [30, p. 63]. 


2. SCHRÖDER. Friedrich Wilhelm Karl Ernst Schröder was a German logician
who was born in Mannheim on November 25, 1841, and died in Karlsruhe on June
16, 1902. He passed the doctoral exam at the University of Heidelberg in 1862 and
had positions in Zurich (at the Eidgenössische Polytechnikum), Karlsruhe,
Pforzheim, and Baden-Baden, before accepting a post as full professor at
Karlsruhe in 1876. Schröder worked mainly on the foundations of mathematics,
notably with combinatorics, the theory of functions of a real variable, and
mathematical logic. He was one of the first persons to accept Cantor’s ideas in set theory
and was one of the developers of mathematical logic in the second half of the
nineteenth century. Schröder is best known to combinatorialists for his paper [25],
in which he discusses four “bracketing problems.” The first two problems concern 
the bracketing or parenthesization of a string of letters that we may assume to be 
all identical, say the letter x. The second two problems are analogues of the first 
two where the string of letters is replaced by a set of elements. We will discuss only 
the first two problems here. 

The formal definition of a bracketing is the following. First, x itself is
considered to be a bracketing. Recursively define a bracketing to be a sequence
$B = (B_1, \ldots, B_k)$, where $k \ge 2$ and each $B_i$ is a bracketing. We represent the
bracketing B as a parenthesized string of x’s. Thus, think of B as a k-ary product
$(B_1)(B_2)\cdots(B_k)$. If some $B_i$ is the single letter x, then we remove the parentheses
surrounding $B_i$ for clarity of notation. Thus, for example, the bracketing


(xx)((xxxxx)x(xx))(xx(xx)) (1)

represents a way of multiplying 14 x’s whose last operation was a ternary operation
$(B_1)(B_2)(B_3)$, where $B_1$ = xx, $B_2$ = (xxxxx)x(xx), and $B_3$ = xx(xx), and similarly
for $B_1$, $B_2$, and $B_3$. There are exactly eleven bracketings of four letters, namely,


xxxx    (xx)xx    x(xx)x    xx(xx)    (xxx)x    x(xxx)


((xx)x)x    (x(xx))x    (xx)(xx)    x((xx)x)    x(x(xx)).

Note that the last five of these are built up entirely from binary operations and are
therefore called binary bracketings.
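
The recursive definition can be turned directly into an enumeration. The sketch below is illustrative only (the helper names `compositions` and `bracketings` are ours, not Schröder's or the article's): it splits n letters into k ≥ 2 consecutive blocks and brackets each block recursively, recording whether every operation used is binary.

```python
from functools import lru_cache
from itertools import product

def compositions(n, min_parts=2):
    """Yield every ordered tuple of positive integers with sum n and at least min_parts parts."""
    def rec(remaining, parts):
        if remaining == 0:
            if len(parts) >= min_parts:
                yield tuple(parts)
            return
        for first in range(1, remaining + 1):
            yield from rec(remaining - first, parts + [first])
    yield from rec(n, [])

@lru_cache(maxsize=None)
def bracketings(n):
    """Return a tuple of (string, is_binary) pairs, one per bracketing of n letters."""
    if n == 1:
        return (("x", True),)
    result = []
    for comp in compositions(n):
        for blocks in product(*(bracketings(i) for i in comp)):
            # A block that is a single letter is written without parentheses.
            text = "".join(s if s == "x" else f"({s})" for s, _ in blocks)
            binary = len(comp) == 2 and all(flag for _, flag in blocks)
            result.append((text, binary))
    return tuple(result)
```

Calling `bracketings(4)` yields the eleven strings listed above, five of which are flagged as binary.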

There are three fundamental equivalent ways to represent a bracketing in
addition to a parenthesized string discussed above: as plane trees, polygon
dissections, and Łukasiewicz words. We now briefly describe these alternative
representations. If B is a bracketing, then we first define the plane tree $\tau(B)$
corresponding to B. If B consists of a single letter, then $\tau(B)$ is a single root vertex. If
$B = (B_1, \ldots, B_k)$ then $\tau(B)$ consists of a root vertex (drawn at the top), with
subtrees $\tau(B_1), \ldots, \tau(B_k)$, drawn in that order from left to right. Thus, the key
property defining a plane tree is that the subtrees of every vertex are linearly
ordered. For instance, the plane tree corresponding to the bracketing of equation
(1) is shown in Figure 1. Note that a binary bracketing corresponds to a binary
plane tree, i.e., a plane tree for which every non-endpoint vertex has exactly two
successors.

Next we consider polygon dissections. Let P be a convex polygon. A dissection
of P is obtained by drawing some diagonals that don’t intersect in their interiors.
Thus, P is divided up into regions that are themselves convex polygons. In
particular, if P has m sides and we draw m − 3 such diagonals (the maximum
number possible), then we obtain a dissection for which every region is a triangle;
such dissections are called triangulations. We now explain how to associate a plane
tree $\tau(D)$ with a polygon dissection D. We associate with the “degenerate”
polygon with just two vertices a single root vertex. Now fix once and for all an edge
e of the polygon P, called the root edge. In a given dissection D, the edge e is
contained in a unique polygon Q that is a region of D. Let k + 1 be the number of
edges of Q. If we remove the edge e and the interior of Q from D, then we are
left with dissections $D_1, D_2, \ldots, D_k$ of k polygons (some possibly with just two
vertices), reading counterclockwise from e along the boundary of Q, such that $D_i$
and $D_{i+1}$ intersect at a single vertex for $1 \le i \le k - 1$. Define recursively $\tau(D)$ to
be the plane tree whose subtrees of the root are $\tau(D_1), \ldots, \tau(D_k)$ in that order.
Note that if P has n + 1 vertices, then $\tau(D)$ has n endpoints. Figure 2 shows the
polygon dissection corresponding to the tree of Figure 1.


Figure 1. A plane tree.


Figure 2. A polygon dissection.

Finally we consider Łukasiewicz words. The letters of such words come from
the alphabet $A = \{x_0, x_1, x_2, \ldots\}$. The weight $\delta(x_i)$ of a letter $x_i$ is defined by
$\delta(x_i) = i - 1$. A word $y_1 y_2 \cdots y_m$ made of letters from A is said to be a
Łukasiewicz word if $\delta(y_1) + \cdots + \delta(y_j) \ge 0$ for $1 \le j \le m - 1$, and
$\delta(y_1) + \cdots + \delta(y_m) = -1$. Thus, $y_m = x_0$. The set of all Łukasiewicz words is called the
Łukasiewicz language [17, Ch. 11.3]. To obtain a Łukasiewicz word $w(\tau)$ from a
plane tree $\tau$, do a depth-first (preorder) search through the tree. By definition, this
is a linear ordering $\delta(\tau) = v_1, v_2, \ldots, v_p$ of the vertex set of $\tau$ defined recursively
by $\delta(\tau) = v, \delta(\tau_1), \ldots, \delta(\tau_k)$, where v is the root of $\tau$, and $\tau_1, \ldots, \tau_k$ are the
subtrees of v (in that order). Define


$$w(\tau) = x_{\deg(v_1)}\, x_{\deg(v_2)} \cdots x_{\deg(v_p)},$$


where $\deg(v_i)$ denotes the degree (number of successors or children) of vertex $v_i$.
For instance, the Łukasiewicz word corresponding to the plane tree of Figure 1 is


$$x_3 x_2 x_0^2 x_3 x_5 x_0^6 x_2 x_0^2 x_3 x_0^2 x_2 x_0^2.$$


Note that since our bracketings B do not allow unary operations, the plane tree
$\tau(B)$ has no vertices of degree one, and the corresponding Łukasiewicz word does
not involve the letter $x_1$.
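
As an illustration of the passage from a bracketing to its word, the following sketch parses a parenthesized string into a nested list (a stand-in for the plane tree), reads off the vertex degrees in preorder, and checks the partial-sum condition with weight i − 1. The function names are hypothetical, and a word is represented simply by its list of subscripts.

```python
def parse(text):
    """Parse a bracketing such as '(xx)x' into a nested list; a leaf is the string 'x'."""
    stack = [[]]
    for ch in text:
        if ch == "(":
            stack.append([])
        elif ch == ")":
            node = stack.pop()
            stack[-1].append(node)
        elif ch == "x":
            stack[-1].append("x")
    root = stack[0]
    return root[0] if len(root) == 1 else root

def lukasiewicz(tree):
    """Subscripts of the word w(tree): vertex degrees listed in preorder."""
    if tree == "x":
        return [0]
    word = [len(tree)]
    for subtree in tree:
        word.extend(lukasiewicz(subtree))
    return word

def is_lukasiewicz(word):
    """Partial sums of (i - 1) stay nonnegative until the last letter, where the total is -1."""
    total = 0
    for position, i in enumerate(word, start=1):
        total += i - 1
        if position < len(word) and total < 0:
            return False
    return total == -1
```

For the bracketing xx(xx), the root has three children and the last child has two, so the subscripts are 3, 0, 0, 2, 0, 0; the number of zeros equals the number of letters.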

The correspondences we have established are easily seen to yield the following 
result. 


Proposition. (a) Let s(n) denote the total number of bracketings of a string of n letters.
Then s(n) is also equal to (i) the number of plane trees with no vertex of degree one
and with n endpoints, (ii) the number of dissections of a convex (n + 1)-gon, and (iii)
the number of Łukasiewicz words with no $x_1$’s and with n $x_0$’s.

(b) Let b(n) denote the number of binary bracketings of a string of n letters. Then
b(n) is also equal to (i) the number of binary plane trees with n endpoints (and hence
with 2n − 1 vertices), (ii) the number of triangulations of a convex (n + 1)-gon, and
(iii) the number of Łukasiewicz words with n $x_0$’s and n − 1 $x_2$’s (and with no other
letters); such words, usually with the last $x_0$ deleted, are sometimes called Dyck words.


We are now ready to explain the contribution of Schröder to these bracketing
problems. Schröder’s first problem asks for the number b(n) of binary bracketings
of a string of n letters. Using a generating function argument, Schröder derives the
formula (stated slightly differently)

$$b(n) = \frac{1}{n}\binom{2n-2}{n-1}.$$

Thus b(n) is just the Catalan number $C_{n-1}$, for which an enormous literature
exists. For some further information and references, see [11], [14]. A list of about
fifty combinatorial interpretations of Catalan numbers will appear in [31, Exercise
6.17] and is available on the World Wide Web at
http://www-math.mit.edu/~rstan/ec/ec.html.

Schröder’s second problem asks for the total number s(n) of bracketings of a
string of n letters. Schröder’s main result on his second problem is the generating
function

$$\sum_{n \ge 1} s(n)x^n = \frac{1}{4}\left(1 + x - \sqrt{1 - 6x + x^2}\right). \qquad (2)$$

He also gives the values (with the typographical error 145 for s(5) = 45)

$$(s(1), \ldots, s(10)) = (1, 1, 3, 11, 45, 197, 903, 4279, 20793, 103049). \qquad (3)$$


Perhaps the quickest way to obtain equation (2) is the following. Let y denote the
left-hand side. The recursive definition of bracketing is equivalent to the formula

$$y = x + y^2 + y^3 + y^4 + \cdots = x + \frac{y^2}{1 - y}. \qquad (4)$$

Multiplying by 1 − y yields the quadratic equation

$$2y^2 - (1 + x)y + x = 0. \qquad (5)$$

One of the solutions is spurious, and the other one is just the right-hand side
of (2).

The numbers s(n) are now called Schröder numbers. Schröder does not mention
any other combinatorial interpretations of Schröder numbers, nor does he give a
single outside reference. Let us point out some additional references. The problem
of counting the triangulations of a convex polygon was raised by Segner [26] and
solved (anonymously) by Euler [9]. The connection between bracketings and plane
trees was known to Cayley [4]. The bijection between plane trees and polygon
dissections appears in Etherington [8], with a sequel by Erdélyi and Etherington in
[7]. The bijection between bracketings and Łukasiewicz words is essentially the
“reverse Polish notation” or “parenthesis-free notation” developed by the Polish
logician Jan Łukasiewicz (1878-1956). He came upon the idea of this notation in
1924 and first published it in 1929, as explained in [18, p. 180, footnote 3]. The
connection between reverse Polish notation and enumerative combinatorics
appears in a pioneering paper of George Raney [22].

There is now a considerable literature on Schröder numbers and related
numbers. To get into this literature, see [3], [15, p. 55], [23], [27], and [34]. Let us
also mention that it is easy to obtain a simple recurrence relation [5], [6, p. 57] for
the Schröder numbers that allows them to be computed rapidly. Namely,
differentiate (5) with respect to x and solve for y′ to obtain

$$y' = \frac{y - 1}{4y - 1 - x} = \frac{(x - 3)y - x + 1}{x^2 - 6x + 1},$$

the latter equality a consequence of the quadratic equation (5). Hence

$$(x^2 - 6x + 1)y' - (x - 3)y + x - 1 = 0.$$

Expanding the left-hand side in a power series in x and setting the coefficient of
$x^{n+1}$ equal to 0 yields

$$(n + 2)s(n + 2) - 3(2n + 1)s(n + 1) + (n - 1)s(n) = 0, \qquad n \ge 1. \qquad (6)$$

No direct combinatorial proof of this formula was known until D. Foata and D.
Zeilberger, after reading an earlier version of this paper, found such a proof [10].
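
To see how quickly recurrence (6) produces the values in (3), here is a short sketch. It assumes only the initial values s(1) = s(2) = 1 read off from (3); the function name is ours.

```python
def schroeder_numbers(count):
    """Return [s(1), ..., s(count)] from recurrence (6), starting with s(1) = s(2) = 1."""
    s = [None, 1, 1]  # 1-based storage: s[1] = s[2] = 1
    for n in range(1, count - 1):
        # (n + 2) s(n + 2) = 3 (2n + 1) s(n + 1) - (n - 1) s(n)
        numerator = 3 * (2 * n + 1) * s[n + 1] - (n - 1) * s[n]
        s.append(numerator // (n + 2))  # the division is exact for these integers
    return s[1:count + 1]
```

Each new value costs two multiplications and one division, so the tenth Schröder number is reached after eight steps.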


3. HOUGH. The stage is now set for the dénouement. The astute reader may have
already anticipated it by comparing Plutarch’s cryptic statement with the values (3)
of the Schröder numbers. In January 1994 David Hough (1949-), a graduate
student at George Washington University (who decided only in 1992 that he would
pursue a career in mathematics), noticed that the mysterious number 103,049 of
Plutarch, i.e., the number of compound propositions that can be formed from ten
simple propositions, is just the tenth Schröder number! Hough learned about
Plutarch’s statement from [30, Exercise 1.45]. Hough’s discovery strongly suggests
that Hipparchus was carrying out a calculation equivalent to the modern
calculation of the number of bracketings of a string of ten letters. However, it remains to
determine exactly what Hipparchus and Plutarch meant by a “compound
proposition.” In Stoic logic, compound propositions are built up from simple ones using
such connectives as “and,” “or,” and “if... then” [16, Ch. III.5]. This does not seem
like enough information to pinpoint precisely what Hipparchus had in mind.
We can also ask how Hipparchus computed the number 103,049. As noted in
[24, p. 101], this number is much too large to have been computed by a direct
enumeration of all the cases. Moreover, it is highly unlikely that Hipparchus was
aware of the sophisticated recurrence (6). More probable is that Hipparchus used
the “obvious” recurrence (equivalent to equation (4))

$$s(n) = \sum_{i_1 + \cdots + i_k = n} s(i_1)s(i_2)\cdots s(i_k), \qquad n \ge 2, \qquad (7)$$

where the sum ranges over all ways to write n as an (ordered) sum of $k \ge 2$
positive integers. The sum on the right-hand side of equation (7) in the case
n = 10 has 511 terms. There are only 41 “essentially different” terms,
corresponding to the 41 partitions of 10 into at least two parts, i.e., the 41 ways to write 10 as
an unordered sum of at least two positive integers. If the terms of the sum are
grouped according to the partition of 10 to which they correspond, it is still
necessary to count the number of ways of ordering each partition. For instance, the
partition 3 + 2 + 2 + 1 + 1 + 1 has 60 orderings of its terms, thus contributing
the amount $60\,s(3)s(2)^2 s(1)^3$ to the sum (7). We cannot but admire Hipparchus’
ability to compute the Schröder number s(10) at a distant time when not even a
remotely similar accurate computation is known. For further information about
combinatorics in ancient times, see [2], [24].
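
The grouping of the 511 terms by partition can be replayed mechanically. The sketch below is a modern illustration and makes no claim about Hipparchus' actual procedure: it lists the partitions of n, counts the orderings of each by a multinomial coefficient, and sums the contributions as in (7). All function names are ours.

```python
from collections import Counter
from functools import lru_cache
from math import factorial, prod

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def orderings(partition):
    """Number of distinct ways to order the parts (a multinomial coefficient)."""
    result = factorial(len(partition))
    for multiplicity in Counter(partition).values():
        result //= factorial(multiplicity)
    return result

@lru_cache(maxsize=None)
def s(n):
    """Schröder number via recurrence (7), with the terms grouped by partition."""
    if n == 1:
        return 1
    return sum(orderings(p) * prod(s(i) for i in p)
               for p in partitions(n) if len(p) >= 2)
```

For n = 10 the generator produces 41 partitions with at least two parts, their orderings add up to 511, and the partition 3 + 2 + 2 + 1 + 1 + 1 accounts for 60 of them.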

The number 310,952 in Plutarch’s statement, i.e., the number of compound 
propositions that can be formed from ten simple propositions “on the negative 
side,” remains an enigma. Many possible variants of plane trees have been looked 
at without success. Moreover, Neil Sloane has verified that the numbers 310,952 
and 103,049 + 310,952 = 414,001 do not appear anywhere in the valuable tables 
[28]. Thus the mystery of Plutarch’s statement remains at most half solved. 
NOTES


Edited by Jimmie D. Lawson


Many Correct Digits of π, Revisited


Gert Almkvist 


Let n be a positive integer divisible by 4. Then

$$1 - t^2 + t^4 - t^6 + \cdots - t^{n-2} = \frac{1 - t^n}{1 + t^2}.$$

Integrating from 0 to 1 we obtain

$$\frac{\pi}{4} = \arctan 1 = \int_0^1 \frac{dt}{1 + t^2} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots - \frac{1}{n - 1} + R_n,$$

where

$$R_n = \int_0^1 \frac{t^n}{1 + t^2}\,dt.$$

One easily estimates

$$\frac{1}{2(n + 1)} < R_n < \frac{1}{n + 1},$$

so if we try to compute π taking n/2 terms in Gregory’s series

$$\pi = 4\left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots\right)$$

then the error is of order 1/n. This did not prevent R. D. North from computing
(to 40 digits)

$$4\left(1 - \frac{1}{3} + \frac{1}{5} - \cdots - \frac{1}{999999}\right) = 3.14159\ 06535\ 89793\ 24046\ 26433\ 83269\ 50288\ 4197$$

while π = 3.14159 26535 89793 23846 26433 83279 50288 4197.


So only 4 out of 40 digits are wrong. This remarkable fact was explained in the
celebrated paper [2] by the Borwein brothers and Karl Dilcher. Here we offer two
other ways to do this.


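First method. Make the substitution $t = e^{-x}$ in the remainder term:

$$R_n = \int_0^1 \frac{t^n}{1 + t^2}\,dt = \int_0^\infty \frac{e^{-(n+1)x}}{1 + e^{-2x}}\,dx = \frac{1}{2}\int_0^\infty \frac{e^{-nx}}{\cosh x}\,dx.$$

Let $f(x) = 1/\cosh x$. Then

$$f(x) = \sum_{k=0}^{m-1} E_{2k}\frac{x^{2k}}{(2k)!} + f^{(2m)}(\xi)\frac{x^{2m}}{(2m)!}$$

for some $\xi$ between 0 and x, where the $E_n$’s are the Euler numbers

$$E_0 = 1,\quad E_2 = -1,\quad E_4 = 5,\quad E_6 = -61,\ \ldots$$

and $E_n = 0$ if n is odd.


Lemma. $|f^{(2m)}(x)| \le |E_{2m}|$ for all m and x.


Proof (Jan Gustavsson): The trick is to use the fact that f(x) is essentially its own
Fourier transform:

$$f(x) = \frac{1}{\cosh x} = \int_0^\infty \frac{\cos xt}{\cosh(\pi t/2)}\,dt.$$

Differentiate 2m times and take absolute values:

$$|f^{(2m)}(x)| = \left|(-1)^m \int_0^\infty \frac{t^{2m}\cos xt}{\cosh(\pi t/2)}\,dt\right| \le \int_0^\infty \frac{t^{2m}}{\cosh(\pi t/2)}\,dt = |E_{2m}|.$$


Integrating we obtain (using the Laplace transformation)

$$2R_n = \sum_{k=0}^{m-1} \frac{E_{2k}}{(2k)!}\int_0^\infty x^{2k}e^{-nx}\,dx + r_{n,m} = \sum_{k=0}^{m-1} \frac{E_{2k}}{n^{2k+1}} + r_{n,m}$$
$$= \frac{1}{n} - \frac{1}{n^3} + \frac{5}{n^5} - \frac{61}{n^7} + \cdots + \frac{E_{2m-2}}{n^{2m-1}} + r_{n,m},$$

where

$$|r_{n,m}| \le \frac{|E_{2m}|}{n^{2m+1}}$$

by the Lemma. It follows that

$$4\left(1 - \frac{1}{3} + \frac{1}{5} - \cdots - \frac{1}{n - 1}\right) = \pi - \frac{2}{n} + \frac{2}{n^3} - \frac{10}{n^5} + \frac{122}{n^7} - \cdots - \frac{2E_{2m-2}}{n^{2m-1}} - 2r_{n,m}.$$


Putting $n = 10^6$ and m = 3 we find the digits different from those of π in North’s
computation.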
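
The estimate is easy to test with exact decimal arithmetic. The sketch below uses the smaller value n = 10^4 to keep the run short; that choice, the 60-digit working precision, and the hard-coded decimal expansion of π are assumptions of this illustration rather than part of the note.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60
PI = Decimal("3.14159265358979323846264338327950288419716939937510")

def gregory_partial(n):
    """4 * (1 - 1/3 + 1/5 - ... - 1/(n - 1)): n/2 terms, for n divisible by 4."""
    total = Decimal(0)
    for k in range(n // 2):
        term = Decimal(1) / Decimal(2 * k + 1)
        total += term if k % 2 == 0 else -term
    return 4 * total

def corrected(n):
    """pi - 2/n + 2/n^3 - 10/n^5, the m = 3 truncation of the expansion."""
    n = Decimal(n)
    return PI - 2 / n + 2 / n**3 - 10 / n**5

n = 10**4
gap = abs(gregory_partial(n) - corrected(n))
bound = Decimal(122) / Decimal(n)**7   # 2 * |E_6| / n^7
```

With m = 3 the remainder bound is 2·61/n^7, so `gap` should not exceed `bound`; replacing n by 10^6 reproduces North's setting at the cost of 500,000 terms.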


Second method. Karl Dilcher told me how the paper with the Borweins was 
written during one hectic weekend. As it turns out, the main idea could have been 
found in about one second by Maple! Here is the crucial line 


asympt(simplify((sum(-4*(-1)^j/(2*j-1), j=1..n/2)-Pi)/
(-1)^(n/2+1)),n,8).


The answer is 
The BBD-paper was cited by Andrews [1] as an example of the superiority of
humans over the computer. Now the one-line Maple program above will of course 
be “gefundenes Fressen” for Zeilberger in his quest for arguments for the com- 
puter. 
A Note on a Cake Cutting Algorithm
of Banach and Knaster 


Martin L. Jones 


The following famous problem was proposed by Steinhaus [4]. Can an object such 
as a cake be divided among n participants in such a way that each participant 
receives a piece equal to 1/n of the total cake according to his or her value 
system? Since its proposal there have been many solutions. One of the more 
elegant is the simple constructive algorithm attributed to Banach and Knaster by 
Steinhaus, which effects the desired division in the following manner. A long knife 
is passed parallel to itself slowly over the cake until some participant yells “stop,” 
at which point the cake is cut and the piece just described by the knife is given to 
that participant. Ties are broken arbitrarily. The procedure is repeated with the 
remaining participants and with what remains of the cake. An implicit assumption 
of this algorithm is that the participants’ values (measures) of the cake change
continuously with the position of the knife. If the measures are assumed only to be 
nonatomic, that is, single points of cake have measure zero, then the Banach and 
Knaster algorithm might fail. Nonatomic measures can still have positive measure 
on lines or planes of cake. Imagine if a line of frosting parallel to the knife blade 
had measure one for each participant. Everyone would yell “stop” at the same 
time. In this case, a simple reorientation of the knife blade might solve the 
problem. That this can be done in general is the focus of this note. 
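
The moving-knife procedure can be sketched in the well-behaved case where every participant's measure changes continuously with the knife position. The code below is an illustration under simplifying assumptions that are ours, not the note's: the cake is the interval [0, 1], each measure is given by a piecewise-constant density on equal cells, and ties go to the lowest index.

```python
from fractions import Fraction

def value(density, a, b):
    """Value of the piece [a, b] under a piecewise-constant density on equal cells of [0, 1]."""
    cells = len(density)
    total = Fraction(0)
    for c, d in enumerate(density):
        lo, hi = Fraction(c, cells), Fraction(c + 1, cells)
        overlap = min(b, hi) - max(a, lo)
        if overlap > 0:
            total += d * overlap
    return total

def stop_position(density, left, share):
    """Knife position at which the piece starting at `left` first reaches value `share`."""
    cells = len(density)
    need = share
    for c, d in enumerate(density):
        lo, hi = Fraction(c, cells), Fraction(c + 1, cells)
        if hi <= left:
            continue
        start = max(lo, left)
        available = d * (hi - start)
        if d > 0 and available >= need:
            return start + need / d
        need -= available
    return Fraction(1)  # the participant never yells "stop" before the end of the cake

def banach_knaster(densities):
    """Return {participant: (left, right)}; ties are broken by lowest index."""
    n = len(densities)
    share = Fraction(1, n)
    remaining = list(range(n))
    left = Fraction(0)
    pieces = {}
    while len(remaining) > 1:
        stops = {i: stop_position(densities[i], left, share) for i in remaining}
        winner = min(remaining, key=lambda i: stops[i])  # first to yell "stop"
        pieces[winner] = (left, stops[winner])
        left = stops[winner]
        remaining.remove(winner)
    pieces[remaining[0]] = (left, Fraction(1))
    return pieces

# Three participants, four equal cells; each density integrates to 1 over [0, 1].
densities = [
    [1, 1, 1, 1],  # values the cake uniformly
    [2, 2, 0, 0],  # cares only about the left half
    [0, 0, 2, 2],  # cares only about the right half
]
pieces = banach_knaster(densities)
```

In this example every participant ends up with a piece worth at least 1/3 by his or her own measure. Measures that concentrate positive mass on a single knife position are exactly the case this sketch does not model.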

Let $\mu$ be a probability measure defined on $(\mathbb{R}^m, \mathcal{B}^m)$, where $\mathcal{B}^m$ is the Borel
$\sigma$-algebra in $\mathbb{R}^m$. The probability measure $\mu$ is said to be nonatomic if each
individual m-tuple in $\mathcal{B}^m$ has measure zero. However, subsets of dimension one or
higher, such as lines and planes in $\mathcal{B}^m$, may have positive $\mu$ measure. A set in $\mathcal{B}^m$
that is a translation of a k-dimensional subspace of $\mathbb{R}^m$ will be referred to as an
affine subspace of dimension k. For example, lines are affine subspaces of
dimension 1, planes are affine subspaces of dimension 2, etc. For each k = 1, 2, ..., m,
let


$$H_k = \{A \in \mathcal{B}^m : A \text{ is an affine subspace of dimension } k\},$$


and let

$$H_k^* := \{A \in H_k : \mu(A) > 0 \text{ and } \mu(C) = 0 \text{ whenever } C \subset A \text{ is an affine subspace with } \dim(C) < \dim(A)\}.$$


The set $H_k^*$ is the collection of affine subspaces of dimension k that have positive
$\mu$ measure, but all of whose proper affine subspaces do not. The following theorem
restricts the cardinality of $H_k^*$ under the assumption that $\mu$ is nonatomic.


Theorem. Let $\mu$ be a nonatomic probability measure on $(\mathbb{R}^m, \mathcal{B}^m)$. Then the set $H_k^*$
is countable for each k = 1, 2, ..., m.


Proof: Let $H_k^*(n) = \{A \in H_k^* : \mu(A) > 1/n\}$ for each $n \ge 1$, so
$H_k^* = \bigcup_{n=1}^{\infty} H_k^*(n)$. We will show that card $H_k^*(n) \le n$ by contradiction. Suppose card
$H_k^*(n) > n$, and let $A_1, \ldots, A_{n+1}$ be distinct affine subspaces of dimension k in
$H_k^*(n)$. By the principle of inclusion-exclusion we have

$$\mu\Big(\bigcup_{j=1}^{n+1} A_j\Big) = \sum_{j=1}^{n+1} \mu(A_j) - \sum_{i<j} \mu(A_i \cap A_j) + \cdots + (-1)^{n+2}\mu(A_1 \cap \cdots \cap A_{n+1}).$$

Since the intersection of any two or more distinct affine subspaces of dimension k
is either empty or an affine subspace of dimension less than k, it follows from the
nonatomic property of $\mu$ and the definition of $H_k^*$ that

$$\mu\Big(\bigcup_{j=1}^{n+1} A_j\Big) = \sum_{j=1}^{n+1} \mu(A_j).$$

However, since each $A_j$ belongs to $H_k^*(n)$ we have

$$\mu\Big(\bigcup_{j=1}^{n+1} A_j\Big) = \sum_{j=1}^{n+1} \mu(A_j) > \frac{n+1}{n} > 1,$$

contradicting the fact that $\mu$ is a probability measure. Therefore card $H_k^*(n) \le n$,
and $H_k^*$ is countable. ∎


Let the cake be represented by the solid unit sphere S in $\mathbb{R}^3$, let $\mathcal{B}$ denote the
Borel $\sigma$-algebra of subsets of S, and let $\mu$ be a nonatomic probability measure on
$(S, \mathcal{B})$. As the knife passes over the cake during the division process, the plane
containing the knife blade partitions the cake into two pieces. To each possible
orientation of the blade, there correspond two unit normals, $v_1 = a_1 i + b_1 j + c_1 k$
and $v_2 = a_2 i + b_2 j + c_2 k$. The points $(a_1, b_1, c_1)$ and $(a_2, b_2, c_2)$ on the surface of
S are the points of tangency to the plane when it “enters” and “exits” the cake.
Therefore, to each possible orientation of the knife blade there correspond two
points on the surface of S. Let $\eta$ be normalized Lebesgue measure on the surface
of S. The following corollary shows that the $\eta$ measure of the “troublesome”
orientations is zero.


Corollary 1. Let $\mu$ be a nonatomic probability measure on $(S, \mathcal{B})$. Let V be the set of
all triples (a, b, c) such that v = ai + bj + ck is a unit vector in $\mathbb{R}^3$ for which there
exists a plane p in $\mathcal{B}$ normal to v with $\mu(p) > 0$. Then $\eta(V) = 0$.


Proof: First note that if p is a plane in $\mathcal{B}$ with $\mu(p) > 0$, then either p contains a
line with positive $\mu$ measure or it does not. Let $V_1$ be the set of triples
$(a, b, c) \in V$ for which v = ai + bj + ck is normal to a plane in $\mathcal{B}$ that contains a
line with positive $\mu$ measure. Let $V_2 = V \setminus V_1$. To each $L \in H_1^*$ (so that
$\mu(L) > 0$), there correspond uncountably many planes containing L. The normals
to these planes all lie in the plane through the origin normal to L. Therefore, each
$L \in H_1^*$ contributes a “great circle” of triples to $V_1$. By the Theorem, $H_1^*$ is
countable, so $V_1$ consists of the triples on countably many great circles on the
surface of S. If p does not contain a line with positive $\mu$ measure, but has positive
$\mu$ measure itself, then $p \in H_2^*$. For each such p there are two unit normals. By
the Theorem, $H_2^*$ is countable, so $V_2$ is countable. Thus, $\eta(V) = \eta(V_1 \cup V_2)$
= surface area $(V_1 \cup V_2)$ = 0. ∎


By repeating this argument for each of the n nonatomic probability measures, 
we obtain the following extension of Corollary 1. 


Corollary 2. Let $\mu_1, \ldots, \mu_n$ be nonatomic probability measures on $(S, \mathcal{B})$. For each
i = 1, ..., n, let $V_i$ be the set of all triples (a, b, c) such that v = ai + bj + ck is a
unit vector in $\mathbb{R}^3$ for which there exists a plane p in $\mathcal{B}$ normal to v with $\mu_i(p) > 0$.
Then $\eta\big(\bigcup_{i=1}^{n} V_i\big) = 0$.


Corollary 2 proves even more than was originally intended. Not only is it 
possible to utilize the Banach and Knaster algorithm for nonatomic measures, but 
if the orientation of the knife blade is chosen at random, a useable orientation will 
be chosen with probability one! 
The Impossibility of Unstable, Globally
Attracting Fixed Points for Continuous 
Mappings of the Line 


Hassan Sedaghat 


It is possible for a fixed point of a dynamical system to locally repel some 
trajectories, yet globally attract all trajectories. For example, consider the mapping 


$$f_a(x) = \begin{cases} -2x & \text{if } x < a \\ 0 & \text{if } x \ge a \end{cases}$$


where a is any fixed positive real number. Then the first order difference equation


$$x_{n+1} = f_a(x_n), \qquad n = 0, 1, 2, 3, \ldots \qquad (1)$$


has a solution


$$x_n = f_a^n(x_0) = \begin{cases} (-2)^n x_0 & \text{if } (-2)^k x_0 < a \text{ for all } k < n \\ 0 & \text{if } (-2)^k x_0 \ge a \text{ for some } k < n \end{cases}$$

for every choice of $x_0 \in \mathbb{R}$ ($f_a^n$ represents the n-th iterate of $f_a$ under function
composition). Clearly, once $x_k \ge a$ for any k, then $x_n = 0$ for all n > k. In
particular, every solution of (1) converges to zero, regardless of the choice of $x_0$. In
this sense, the origin, which is the unique fixed point of $f_a$, is globally attracting.
However, if $x_0 \ne 0$, then no matter how close $x_0$ is chosen to the origin, $x_n$ must
first reach or exceed a before ultimately reaching the origin. Hence, the origin is unstable
(in fact, locally repelling).
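
A few lines of code make the behaviour of this example concrete. The sketch below simply iterates the map; the function names and the sample values a = 1 and x_0 = 0.001 are chosen here for illustration.

```python
def f(x, a):
    """The discontinuous map f_a: -2x below a, and 0 from a onward."""
    return -2.0 * x if x < a else 0.0

def trajectory(x0, a, steps):
    """Return [x_0, x_1, ..., x_steps] for the difference equation x_{n+1} = f_a(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1], a))
    return xs

orbit = trajectory(1e-3, 1.0, 30)
```

Starting one thousandth away from the origin, the iterates alternate in sign and double in size until one of them reaches a; the next iterate is 0 and the orbit stays there.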

The preceding example shows that globally attracting fixed points that are not 
stable can easily occur in one-dimensional dynamical systems such as (1). Since $f_a$
is discontinuous at x = a, it is natural to ask whether a continuous example of an
unstable global point attractor can be constructed in one dimension. As the title of 
this note suggests, this is not possible. To see why continuous maps are nice in this 
sense, we need a local or asymptotic stability result from [7, p. 47]. Complete 
definitions of all concepts and terminology used here can be found in [2] and [5]. 


Criterion for asymptotic stability of fixed points: A fixed point $\bar{x}$ of a continuous
map f is asymptotically stable if and only if there is an open interval (a, b) containing $\bar{x}$
such that $f^2(x) > x$ for $a < x < \bar{x}$ and $f^2(x) < x$ for $\bar{x} < x < b$.


The preceding criterion is remarkable for not requiring any differentiability 
conditions on f. Now we are ready to demonstrate our main result: 


A continuous mapping of the real line cannot have an unstable fixed point that is 
globally attracting. 

Suppose, on the contrary, that a continuous mapping f of the real line has an
unstable fixed point $\bar{x}$ that is also globally attracting. Since there can be no
periodic solutions, the iterate $f^2$ crosses the identity line only at $\bar{x}$. Hence, only
one of the following two cases is possible:

(I) $f^2(x) > x$ for $x < \bar{x}$ and $f^2(x) < x$ for $x > \bar{x}$;

(II) $f^2(x) < x$ for $x < \bar{x}$, or $f^2(x) > x$ for $x > \bar{x}$.

By the preceding Criterion, Case (I) implies stability and must therefore be
ruled out; this leaves Case (II). Assume that $f^2(x) > x$ for $x > \bar{x}$, and let $x_0 > \bar{x}$.
Then $f^2(x_0) > x_0$; as this implies $f^2(x_0) > \bar{x}$, repeated applications of $f^2$ to $x_0$
generate the increasing sequence


$$\bar{x} < x_0 < f^2(x_0) < f^4(x_0) < \cdots$$


By continuity, $f^{2n}(x_0) \to \infty$ as $n \to \infty$, implying that $\{f^n(x_0)\}$ does not converge to
$\bar{x}$. The case $f^2(x) < x$ for $x < \bar{x}$ reaches a similar contradiction, so we conclude
that our original assumption on $\bar{x}$ was false.

A natural question with regard to the preceding impossibility result is whether 
one dimensionality is necessary (in addition to continuity) in order to rule out the 
existence of unstable global point attractors. The answer is indeed affirmative, and 
examples of continuous (in fact, differentiable) planar maps having unstable, 
globally attracting fixed points exist in the literature; see, e.g., [4, p. 90], or the 
discretization of the continuous time example in [1, p. 59]. Unstable fixed points 
that are globally attracting can also arise in a continuous second order difference 
equation, which is a very special type of a two dimensional system. Generally, a 
second order difference equation has the form 


$y_{n+1} = F(y_n, y_{n-1}), \quad n = 0, 1, 2, 3, \ldots,$ (2)


where $F : \mathbb{R}^2 \to \mathbb{R}$ and real numbers $y_0$, $y_{-1}$ are specified as initial conditions. A
fixed point of (2) is a solution of $F(y, y) = y$. The particular $F$ that we discuss here
is not an artificial construct of purely theoretical interest; rather, it comes from the 
classical Hicks model of the trade cycle, an early mathematical model that aimed 
to explain well-documented fluctuations in economic output or GNP that cause 
recessions periodically; see [3]. The simplified, static Hicks model with a single- 
period lag is given by equation (2) in which $F$ is the continuous, piecewise linear
mapping


$F(u, v) = \min\{K,\ a + bu + c \max\{u - v,\ d\}\}$ (3)


with constants $a, c > 0$, $d < 0$, $0 < b < 1$, and $K > a/(1 - b)$. The ratio $a/(1 - b)$
gives the unique fixed point (or equilibrium) $\bar{y}$ of the Hicks equation; for an
explanation of the general Hicks model and the details of all derivations, see [6]. In
particular, the negative number d is what Hicks calls the “floor level of induced 
investment.” It is shown in [6] that, under these hypotheses, every non-equilibrium 
solution of (3) executes bounded, non-decaying oscillations about the unstable (and 
non-attracting) fixed point, as is expected of the “business cycle.” But what 
happens when d approaches zero? It is in the limiting case d = 0 of the Hicks 
equation that the fixed point $\bar{y}$ turns into a global attractor, which is unstable if


$c > (1 + \sqrt{1 - b})^2$. (4)


To see this, choose $y_{-1} = \bar{y}$ and $y_0 = \bar{y} + \varepsilon$, where $\varepsilon > 0$ is small enough so that
$y_0 < K$. Then the trajectory $\{y_n\}$ develops according to the linear difference
equation


$y_{n+1} = a + (b + c) y_n - c y_{n-1},$


which has exponentially divergent solutions, since condition (4) implies the existence
of eigenvalues with magnitude greater than 1. Hence, $\bar{y}$ is unstable. Upon
reaching $K$, however, the trajectory bounces down and obeys the first order
equation


$y_{n+1} = a + b y_n,$


whose solution clearly converges to $\bar{y}$. Generalizing this argument to arbitrary pairs
of initial conditions is not hard, and establishes that $\bar{y}$ is globally attracting.
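
The escape-and-return behaviour described above can be explored numerically. The following Python sketch iterates equation (2) with the map (3) in the limiting case d = 0. The parameter values (a = 1, b = 0.5, c = 4, K = 10) and the function names are illustrative assumptions, chosen only so that the hypotheses on the constants hold; they are not taken from [6].

```python
def hicks_step(u, v, a, b, c, K, d=0.0):
    """One step of (2) with the map (3): min{K, a + b*u + c*max{u - v, d}}."""
    return min(K, a + b * u + c * max(u - v, d))


def hicks_trajectory(y_prev, y0, a, b, c, K, d=0.0, steps=200):
    """Return [y_{-1}, y_0, y_1, ..., y_steps] for the Hicks equation."""
    ys = [y_prev, y0]
    for _ in range(steps):
        ys.append(hicks_step(ys[-1], ys[-2], a, b, c, K, d))
    return ys


# Illustrative (assumed) parameters; c is large enough that the linearized
# equation has real eigenvalues greater than 1.
a, b, c, K = 1.0, 0.5, 4.0, 10.0
y_bar = a / (1.0 - b)      # the fixed point
eps = 1e-6                 # small initial displacement from the fixed point

ys = hicks_trajectory(y_bar, y_bar + eps, a, b, c, K)
print("largest value reached:", max(ys))
print("value after 200 steps:", ys[-1])
```

With these assumed values a trajectory started very close to the fixed point first climbs to the ceiling K and only afterwards settles back, which is the combination of instability and attraction discussed in the text.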

UNSOLVED PROBLEMS
Edited by Richard Nowakowski 


In this department the MONTHLY presents easily stated unsolved problems dealing with
notions ordinarily encountered in undergraduate mathematics. Each problem should be
accompanied by relevant references (if any are known to the author) and by a brief
description of known partial or related results. Typescripts should be sent to Richard
Nowakowski, Department of Mathematics & Statistics & Computing Science, Dalhousie
University, Halifax NS, Canada B3H 3J5, rjn@cs.dal.ca


Divisors and Desires 


Richard K. Guy 


Zhang Ming-Zhi, Lin Da-Zheng, and Wang Weh-Hui noted that, if $\phi(n)$ and $\sigma(n)$
are the Euler totient function and the sum of divisors function, then it is
well-known that


$\phi(n) + \sigma(n) = 2n$ is a necessary and sufficient condition for $n > 1$ to be prime,


and they asked if there were composite values of $n$ for which $\phi(n) + \sigma(n) = kn$.
For k = 3 they found 312, 560, 588, 1400, 85632, 147492, 556160, 569328, 1590816, 
2013216, 3343776, 4695456, 9745728 and 12558912, while for k = 4 there are 
23760, 59400, 153720, and 4563000. 
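
The listed values are easy to check by machine. The sketch below computes $\phi$ and $\sigma$ from a trial-division factorization and reports the multiplier k when $\phi(n) + \sigma(n) = kn$; the function names are illustrative, and trial division is assumed to be fast enough for numbers of this size.

```python
def factorize(n):
    """Return the prime factorization of n as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def phi(n):
    """Euler totient function."""
    result = 1
    for p, e in factorize(n).items():
        result *= (p - 1) * p ** (e - 1)
    return result


def sigma(n):
    """Sum of the positive divisors of n."""
    result = 1
    for p, e in factorize(n).items():
        result *= (p ** (e + 1) - 1) // (p - 1)
    return result


def multiplier(n):
    """Return k if phi(n) + sigma(n) == k * n for an integer k, else None."""
    total = phi(n) + sigma(n)
    return total // n if total % n == 0 else None


for n in (7, 312, 560, 588, 1400, 23760, 59400):
    print(n, multiplier(n))
```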

Some questions that arise are: 


1. Are there infinitely many solutions for each k > 2? 
2. Is there an odd solution? 
3. All these solutions are of shape 4m. Is there a solution of shape 4m + 2? 


In partial answer to Questions 2 and 3, Zhang, Lin, and Wang show that an odd $n$
must have at least 6 distinct prime factors, and that if $k = 4$ then $n$ has at least 15
such. They also show, parallel to Euler’s result on odd perfect numbers, that an
odd solution is a perfect square and that a singly even solution must be of the form
$n = 2p^a q^2$, where $p$ is a prime with $p \equiv a \equiv 1 \pmod 4$ and $q \perp 2p$.

We read in [1, §B41] that if we average the $\sigma$- and $\phi$-functions, and iterate,
then since $\phi(n)$ is always even for $n > 2$ and $\sigma(n)$ is odd when $n$ is a square or
twice a square, we will sometimes reach a noninteger value. For example,
54, 69, 70, 84, 124, 142, 143, 144, $225\tfrac{1}{2}$; in this case we say that the sequence
fractures. Another possibility, in view of the opening observation, is that such
sequences can become constant, for example, 60, 92, 106, 107, 107, ... .
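
The two example sequences can be reproduced with a few lines of Python. This is a minimal sketch using brute-force definitions of $\phi$ and $\sigma$ (adequate for small starting values); the function names and the step limit are illustrative choices.

```python
from fractions import Fraction
from math import gcd


def phi(n):
    """Euler totient by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def sigma(n):
    """Sum of divisors by direct search (fine for small n)."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def average_orbit(n, max_steps=50):
    """Iterate n -> (phi(n) + sigma(n)) / 2 until the sequence fractures
    (reaches a non-integer), becomes constant, or max_steps is exhausted."""
    orbit = [Fraction(n)]
    for _ in range(max_steps):
        nxt = Fraction(phi(n) + sigma(n), 2)
        orbit.append(nxt)
        if nxt.denominator != 1 or nxt == n:
            break
        n = int(nxt)
    return orbit


print([str(t) for t in average_orbit(54)])   # ends in the non-integer 451/2
print([str(t) for t in average_orbit(60)])   # ends in the repeated prime 107
```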


4. Are there such iterated sequences that increase indefinitely without fractur- 
ing? 


The arithmetic mean of these two functions is not always an integer. Their 
geometric mean is an integer for $n = 1, 14, 30, 51, 105, 170, \ldots$.


5. Are there infinitely many $n$ such that the product $\phi(n)\sigma(n)$ is a perfect
square?

The product is exactly divisible by $n$ when $n$ = 1, 6, 18, 24, 28, 40, 54, 72, 84, 96, 117,
120, 135, 162, 196, 200, 216, 224, 234, 252, 270, 288,.... The perfect and multi- 
perfect numbers are there, of course, but this doesn’t necessarily imply that there 
are infinitely many. But if we note that the odd powers of the perfect numbers are 
also present, we see that there are. 


6. What is the density of these numbers? 


Whenever $n$ is prime, then $\phi(n)\sigma(n)$ is one less than a square. This also happens
for $n$ = 6, 22, 33, 44, 69, 76, 82, ....


7. Are there infinitely many composite numbers of this kind?
8. The product $\phi(n)\sigma(n)$ is constant for $n$ = 55, 56 and 57. Are there
infinitely many such triples? Can there be longer runs?


Scott Forrest asks if, for every base $b$, there is a d-number, i.e., a number $n$ the
sum of whose digits, when written in base $b$, is equal to the number, $d(n)$, of its
divisors. An example of a d-number in base 10 is 262144, since


$d(262144) = d(2^{18}) = 19 = 2 + 6 + 2 + 1 + 4 + 4$.
In fact they seem to be quite numerous, so we can ask more specifically: 
8. For each base is there always a two-digit d-number? 
The least ones for the first few bases are 
$11_2$, $22_3$, $11_4$, $13_5$, $11_6$, $12_7$, $24_8$, $26_9$, $11_{10}$, $13_{11}$, $11_{12}$, $15_{13}$, $24_{14}$.
For some bases the smallest 2-digit d-number requires a somewhat larger first 


digit. Examples are 88,, and 84,,,; the largest that Forrest found with b < 2000 is 
$10b + 2$ for $b = 1397$ where $d(13972) = d(2^2 \cdot 7 \cdot 499) = 12 = 10 + 2$.
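
A short search reproduces the least two-digit d-numbers for small bases. The sketch below is a straightforward brute-force check under the definition given above; the helper names are illustrative.

```python
def num_divisors(n):
    """Number of positive divisors d(n), by trial division up to sqrt(n)."""
    count = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            count += 1 if d * d == n else 2
        d += 1
    return count


def digit_sum(n, base):
    """Sum of the digits of n written in the given base."""
    total = 0
    while n:
        total += n % base
        n //= base
    return total


def is_d_number(n, base):
    """True if the base-`base` digit sum of n equals its number of divisors."""
    return digit_sum(n, base) == num_divisors(n)


def least_two_digit_d_number(base):
    """Return (first digit, second digit) of the least two-digit d-number
    in this base, or None if there is none."""
    for n in range(base, base * base):
        if is_d_number(n, base):
            return divmod(n, base)
    return None


for base in range(2, 15):
    print(base, least_two_digit_d_number(base))
```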


9. Is the first digit of the smallest 2-digit d-number bounded independently of 
the base, or is it infinitely often greater than $\ln b$, for instance?


Forrest also found the following numbers of d-numbers less than $b^5$, i.e., with fewer
than 6 base $b$ digits:


b   2   3   4    5    6    7    8     9    10    11    12    13    14
#   7  26  63  125  220  588  563  1561  1533  2451  2095  6501  3487


The first few entries in the table are remarkably close to $b^3$, but probably this is a
manifestation of the Strong Law of Small Numbers [2]. 


10. Are there infinitely many d-numbers for each base? 
11. If f(b, N) is the number of base b d-numbers less than N, can we find 
good upper and lower bounds for f(b, N’)? 


We are indebted to Andrew Bremner for confirming that the desires of the title 
are indeed base desires. 

PROBLEMS AND SOLUTIONS


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 
with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, M. J. Pelling, Richard Pfiefer, Leonard Smiley, John Henry 
Steelman, Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY 
problems address on the inside front cover. Submitted problems should include 
solutions and relevant references. Submitted solutions should arrive at that address 
before September 30, 1997. Additional information, such as generalizations and
references, is welcome. The problem number and the solver’s name and address
should appear on each solution. An acknowledgement will be sent only if a mailing 
label is provided. An asterisk (*) after the number of a problem or a part of a 
problem indicates that no solution is currently available. 


PROBLEMS 


10585. Proposed by Alta Kellogg, Ormond Beach, FL. A sequence $a_0, a_1, a_2, \ldots$ of real
numbers is called strictly totally positive (STP) if every square submatrix of the Hankel matrix
$(a_{i+j})_{i,j \ge 0}$ has positive determinant.

(a) Show that the sequence $C_0, C_1, C_2, \ldots$ of Catalan numbers, defined by $C_n = \frac{1}{n+1}\binom{2n}{n}$,
is STP.

(b) Show that the sequence of Catalan numbers is minimal in the following sense: If
$a_0, a_1, a_2, \ldots$ is an STP sequence of positive integers with $a_n \le C_n$ for every $n$, then
$a_n = C_n$ for every $n$.


10586. Proposed by C. D. Aliprantis, Indiana University—Purdue University, Indianapolis, 
IN. Let x and y be two nonnegative Lebesgue integrable functions on [0, 1] satisfying 


1 1 
| er O de> | et at, 
0 0 


1 ! 
| ty(the * dt > | tx(t)e* dt. 
0 0 


Show that 


10587. Proposed by Joaquín Gómez Rey, Madrid, Spain. Let $K_{2n}$ be the complete graph
on $2n$ vertices. Let $P_n$ be the probability that two random perfect matchings of $K_{2n}$ are
disjoint. What is $\lim_{n \to \infty} P_n$?


10588. Proposed by Marcin Mazur, The University of Chicago, Chicago, IL. Let $A_1 A_2 A_3$
be a triangle. For $i = 1, 2, 3$, let $B_i$ be a point on side $A_{i+1} A_{i+2}$, where subscripts are taken
modulo 3.

(a) Show that $|A_i B_{i+1}| + |B_i B_{i+1}| = |A_i B_{i+2}| + |B_i B_{i+2}|$ holds for $i = 1, 2, 3$ if and only
if $B_i$ is the midpoint of $A_{i+1} A_{i+2}$ for $i = 1, 2, 3$.

(b) Show that $|A_i B_{i+1}| + |A_i B_{i+2}| = |B_i B_{i+1}| + |B_i B_{i+2}|$ holds for $i = 1, 2, 3$ if and only
if $B_i$ is the midpoint of $A_{i+1} A_{i+2}$ for $i = 1, 2, 3$.

10589. Proposed by Tim Keller, Fair Oaks, CA. Fix n > 3 and let S be the set of positive
integers congruent to 1 modulo n. A number $m \in S$ is called indecomposable if it is not the
product of two smaller numbers in S. Problem 3 from the 1977 International Mathematical 
Olympiad asks for a number that can be expressed as the product of indecomposable numbers 
in more than one way. Show that the least such number is the product of two numbers each 
of the form k(k +n). 


10590. Proposed by Robb Muirhead, University of Michigan, Ann Arbor, MI and Stephen 
Portnoy, University of Illinois, Urbana, IL. Let X have a uniform distribution on the interval 
[0, 1] and let $N_{m,k}$ be the digit in the $m$th place to the right of the decimal point in $X^k$.

(a) Find limy—+oo P(Nnwm = 1) fori =0,1,2,...,9. 

(b) Characterize those functions $k(m)$ for which $\lim_{m \to \infty} P(N_{m,k(m)} = i) = 1/10$ for
$i = 0, 1, 2, \ldots, 9$.


10591. Proposed by John Lomont, University of Arizona, Tucson, AZ. Let $\{a_n\}_{n=1}^{\infty}$ be the
sequence of real numbers defined by $x / \tanh^{-1}(x) = 1 - \sum_{n=1}^{\infty} a_n x^{2n}$.
(a) Prove that $\sum_{n=1}^{\infty} a_n = 1$.

(b) Prove that $a_n > 0$ for all $n \ge 1$.

(c) Prove that $a_1 > 3a_2 > 5a_3 > 7a_4 > 9a_5 > \cdots$.
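
The first few coefficients can be computed exactly, which makes it easy to experiment with parts (b) and (c). The sketch below inverts the power series $\tanh^{-1}(x)/x = \sum_{k \ge 0} x^{2k}/(2k+1)$ term by term with rational arithmetic; the function name is illustrative, and the output is evidence for small $n$ only, not a proof.

```python
from fractions import Fraction


def lomont_coefficients(count):
    """Return [a_1, ..., a_count], where x/artanh(x) = 1 - sum a_n x^(2n).

    Writing artanh(x)/x = sum_{k>=0} t^k/(2k+1) with t = x^2, the reciprocal
    series sum b_n t^n satisfies b_0 = 1 and
    b_n = -sum_{k=1}^{n} b_{n-k}/(2k+1); then a_n = -b_n.
    """
    b = [Fraction(1)]
    for n in range(1, count + 1):
        b.append(-sum(b[n - k] * Fraction(1, 2 * k + 1) for k in range(1, n + 1)))
    return [-value for value in b[1:]]


coeffs = lomont_coefficients(8)
for n, a_n in enumerate(coeffs, start=1):
    print(n, a_n, float((2 * n - 1) * a_n))
```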


SOLUTIONS 


Small Sets Meeting All Circles 


10345 [1993, 874]. Proposed by George Baloglou, SUNY College at Oswego, Oswego, NY,
and Fred Galvin, University of Kansas, Lawrence, KS. Given a subset $X \subset \mathbb{R}$ one obtains
a subset $\mathbb{R}^2 \setminus X^2$ of the plane by removing those points both of whose coordinates are in $X$.
If $X \ne \mathbb{R}$, such a set always contains horizontal and vertical lines.

(a) Find such a set $X$, of Lebesgue measure zero, for which $\mathbb{R}^2 \setminus X^2$ contains no circle.
(b)* Is there such a set $X$, of Lebesgue measure zero, for which every connected subset of
$\mathbb{R}^2 \setminus X^2$ consisting of more than one point contains a horizontal or vertical line segment?


Solution of (a) by Paul J. Szeptycki, Ohio University, Athens, OH. For any measure zero
dense $G_\delta$ set $X \subset \mathbb{R}$, $\mathbb{R}^2 \setminus X^2$ contains no circles.
Let $C \subset \mathbb{R}^2$ be a circle. Let


$C_1 = \{(x, y) \in C : x \in X\}$ and $C_2 = \{(x, y) \in C : y \in X\}$.


Then both $C_1$ and $C_2$ are dense $G_\delta$'s relative to $C$. By the Baire Category Theorem,
$C_1 \cap C_2 \ne \emptyset$, but $C_1 \cap C_2 \subset X^2$.


Solution of (b) by Randall Dougherty, Ohio State University, Columbus, OH. The following 
theorem settles (b) negatively since every set of positive Lebesgue measure includes a perfect 
set. (A set is perfect if it is closed and has no isolated points.) 


Theorem. If $X$ is a set of real numbers such that $\mathbb{R} \setminus X$ includes a perfect set, then there is
a Borel connected subset of $\mathbb{R}^2 \setminus X^2$ consisting of more than one point that does not include
a horizontal or vertical line segment.


Proof. Let $P$ be a perfect set included in $\mathbb{R} \setminus X$; we may assume that $P$ includes no interval
(just reduce $P$ if necessary). Fix a point $z$ in $P$, and let $D$ be a countable dense subset of $P$.

Let $B_1$ be the subset of $P \times \mathbb{R}$ consisting of those points $(x, y)$ such that either $x \in D$
and $y$ is rational, or $x \in P \setminus D$ and $y$ is irrational. This is the Cantor tepee space of Knaster
and Kuratowski (Example 129 in L. A. Steen & J. A. Seebach, Jr., Counterexamples in
Topology, Holt, Rinehart and Winston, 1970), with the apex moved up to infinity. As in the
cited reference, one can show that any two points of $B_1$ on the same vertical line are in the
same quasicomponent of $B_1$; that is, no relatively open and closed subset of $B_1$ contains one
of the two points but not the other. Clearly $B_1$ does not include any vertical line segment.

Let $B_1' = B_1 \cup \{(x, z) : x \in P\}$. Then, since we have added at most one new point on
each vertical line, $B_1'$ does not contain any vertical line segment; and it is still the case that
any two points of $B_1'$ on the same vertical line are in the same quasicomponent of $B_1'$, since
$(x, z)$ is a limit of points $(x, y) \in B_1$.

Now let $B_2' = \{(y, x) : (x, y) \in B_1'\}$ (i.e., the reflection of $B_1'$ through the diagonal
$y = x$) and $B = B_1' \cup B_2'$. Then $B$ is a Borel set that is included in $(P \times \mathbb{R}) \cup (\mathbb{R} \times P)$ and
hence does not meet $X^2$. If $S$ is a vertical line segment then, since $P$ is closed and includes
no interval, there must be a subsegment $S'$ of $S$ that is disjoint from $\mathbb{R} \times P$ and hence from
$B_2'$. Since $S'$ is not included in $B_1'$, $S$ is not included in $B$. A similar argument shows that
$B$ contains no horizontal line segments.

Finally, we show that $B$ is connected by showing that every point of $B$ is in the same
quasicomponent as the point $(z, z)$, since this implies that every relatively open and closed
subset of $B$ is either $B$ or $\emptyset$, depending on whether or not it contains $(z, z)$. If $(x, y) \in B_1'$,
then $(x, y)$ and $(x, z)$ are in the same quasicomponent of $B_1'$ (and hence of $B$), while $(x, z)$
and $(z, z)$ are in the same quasicomponent of $B_2'$ (and hence of $B$), so $(x, y)$ and $(z, z)$ are
in the same quasicomponent of $B$. The argument for $(x, y) \in B_2'$ is similar. This completes
the proof of the theorem.


Editorial comment. Both Randall Dougherty and the proposers gave constructions in part
(a) showing that it is possible for $X$ to be an $F_\sigma$ set of measure zero (in particular, a meager
set). Dougherty also gave the following result, which may be interpreted as saying that
many sets satisfy a weaker form of the property in (b). In particular, such sets would give an
affirmative answer to a version of (b) in which “connected” is replaced by “path connected”.


Theorem. If $X$ is a comeager subset of $\mathbb{R}$ (i.e., $\mathbb{R} \setminus X$ is of the first category), then any
closed connected subset of $\mathbb{R}^2 \setminus X^2$ consisting of more than one point includes a horizontal
or vertical line segment.


The selected solvers solved both parts of the problem. In addition, part (a) was solved by A. N. ’t Woord (The Netherlands) 
and the proposers. 


Preserving Rationality without Being Completely Straight 


10361 [1994, 175]. Proposed by Emil A. Cornea, University of Bucharest, Bucharest,
Romania, and Florin N. Diacu, University of Victoria, Victoria, B. C., Canada. Do there
exist nonlinear $C^1$ functions $f : \mathbb{R} \to \mathbb{R}$ such that for any rational $x$, $f(x)$ is also rational
and for any irrational $x$, $f(x)$ is also irrational?


Editorial comment. Several readers noted that the function $f(x) = x(1 + |x|)^{-1}$ has the
required property.
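
One way to see why this $f$ works: $f$ sends rationals to rationals, and its inverse on $(-1, 1)$, $y \mapsto y(1 - |y|)^{-1}$, does the same, so a rational value of $f$ can only come from a rational argument. The sketch below illustrates this with exact rational arithmetic; the inverse formula and the function names are supplied here for illustration and are not quoted from the readers' solutions.

```python
from fractions import Fraction


def f(x):
    """f(x) = x / (1 + |x|); maps the real line onto (-1, 1)."""
    return x / (1 + abs(x))


def f_inverse(y):
    """Inverse of f on the open interval (-1, 1): y / (1 - |y|)."""
    return y / (1 - abs(y))


for x in (Fraction(3, 4), Fraction(-5, 2), Fraction(0), Fraction(22, 7)):
    y = f(x)
    print(x, "->", y, "->", f_inverse(y))
```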

Most readers gave such examples, pieced together from fractional linear functions. Addi- 
tional smoothness requires a different construction. Some readers used functions defined by 
an infinite process starting from an enumeration of the rationals. While such constructions 
are more complicated, they can be used to construct examples that have such additional 
properties as monotonicity, convexity, or real analyticity. 

The ability to improve smoothness by using an inductive construction led Rick Mabry to 
question whether one could also destroy smoothness; that is, find an example in which f is 
a continuous nowhere differentiable function. Known examples of nowhere differentiable 
functions can be modified to take rationals to rationals, but it appears difficult to also assure 
that irrationals are taken to irrationals. 

The following references to the examples in the literature were given: Rick Mabry
cited F. S. Cater, Derivatives on countable dense sets, Real Analysis Exchange 11 (1985-86),
159-167; Frank Schmidt cited B. Neumann & R. Rado, Monotone functions mapping
the set of rational numbers onto itself, J. Australian Math. Soc. 3 (1963), 282—287; and 
Dave L. Renfro cited Philip Franklin, Analytic transformations of everywhere dense point 
sets, Trans. Amer. Math. Soc. 27 (1925), 91-100 and W. D. Maurer, Conformal equivalence 
of countable dense sets, Proc. Amer. Math. Soc. 18 (1967), 269-270. 


Solved by J. Alvarez (Spain), G. Bartoszek & W. Bartoszek (South Africa), P. Budney, D. Callan, M. W. Cook, R. Ehren- 
borg (Canada), D. Fung, K. P. Hart (The Netherlands), R. Holzsager, R. B. Israel (Canada), F-A. Izadi (Iran), R. M. 
Lansangan, J. H. Lindsey II, O. P. Lossers (The Netherlands), R. Mabry, M. D. Meyerson, A. Nijenhuis, N. Passell, 
D. L. Renfro, K. Schilling, R. L. Schilling (Germany), F. Schmidt, G. L. Stanek, R. Tschiersch (Germany), R. B. Tucker,
T. White, D. R. Witte & J. C. Lagarias, A. N. ’t Woord (The Netherlands), and the NSA Problems Group. 


Constancy and Commensurability 


10380 [1994, 363]. Proposed by Michael Slater, University of Bristol, Bristol, England.
Suppose that $f_1, \ldots, f_n$ are continuous real periodic functions, and that $\sum_{i=1}^{n} f_i$ is a
constant function, while no sum of fewer than $n$ of the $f_i$ is a constant function. Show that the
$f_i$ have a common period.


Solution I by A. N. ’t Woord, University of Technology, Eindhoven, The Netherlands. Let
$c \in \mathbb{R}$ be the value of the constant function $\sum_{i=1}^{n} f_i$. Rearranging the $f_i$, we may assume
that for $2 \le i \le r$, $f_1$ and $f_i$ have a common period and that for $r < i \le n$, $f_1$ and $f_i$
have no common period. Then $f_1, \ldots, f_r$ have a common period, so that $g = \sum_{i=1}^{r} f_i$ is
periodic with period $p$, say. For $r < i \le n$, let $p_i$ denote a period of $f_i$. Then $p/p_i \notin \mathbb{Q}$
for $r < i \le n$.

Let $x \in \mathbb{R}$. We show that $g(x) = c - \sum_{i=r+1}^{n} f_i(0)$. Let $\epsilon > 0$. Since the $f_i$ are
continuous, there exists $\delta > 0$ such that $|x - y| < \delta$ implies $|g(x) - g(y)| < \epsilon/n$ and
$|y| < \delta$ implies $|f_i(0) - f_i(y)| < \epsilon/n$ for $r < i \le n$. Using Kronecker’s approximation
theorem, we find $k, k_i \in \mathbb{Z}$ and $y \in \mathbb{R}$ such that $|y - x - kp| < \delta$ and $|y - k_i p_i| < \delta$ for
$r < i \le n$. Now


$$\begin{aligned}
\Bigl| g(x) - \Bigl( c - \sum_{i=r+1}^{n} f_i(0) \Bigr) \Bigr|
&= \Bigl| g(x) - c + \sum_{i=r+1}^{n} f_i(0) - g(y) + c - \sum_{i=r+1}^{n} f_i(y) \Bigr| \\
&\le |g(x) - g(y)| + \sum_{i=r+1}^{n} |f_i(0) - f_i(y)| \\
&= |g(x) - g(y - kp)| + \sum_{i=r+1}^{n} |f_i(0) - f_i(y - k_i p_i)| < \epsilon.
\end{aligned}$$


Hence $g(x) = c - \sum_{i=r+1}^{n} f_i(0)$, so that $g = \sum_{i=1}^{r} f_i$ is a constant function. Therefore
$r = n$ and we conclude that the $f_i$ have a common period.


Solution II by Richard Holzsager, The American University, Washington, DC. Being periodic,
the functions $f_i$ are also almost periodic, so we may use the inner product


$$(f, g) = \lim_{t \to \infty} \frac{1}{2t} \int_{-t}^{t} f(x) g(x)\, dx.$$


For any $f$, $(1, f)$ is the average value of $f$. Replacing each $f_i$ by $f_i - (1, f_i)$ allows us
to assume that each $f_i$ has zero average, without changing the outcome of the problem.
Assuming that none of the $f_i$ is constant, $\{f_1, \ldots, f_n\}$ decomposes into equivalence classes
consisting of functions sharing a common period. If we replace each equivalence class by
its sum, we get a new family with the same properties, but with no two having the same
period.

We show that if $f$ and $g$ do not have a common period (i.e., their periods are incommensurate),
then $(f, g) = (f, 1)(1, g)$. The result follows easily from this, because it implies that
$(f_i, f_j) = 0$ for $i \ne j$, so $0 = (f_i, 0) = (f_i, \sum_j f_j) = (f_i, f_i)$. But this is a contradiction
since $f_i$ is not identically zero.

Let $a$ and $b$ be the respective periods of $f$ and $g$, with $a$ and $b$ incommensurate. For
any $t$, let $f^t(x) = f(x + t)$. Then $(f, g) = (f^{ma}, g^{nb}) = (f^{ma - nb}, g)$, the first equality
holding because the functions are unchanged and the second because the inner product is
translation invariant. Since $a$ and $b$ are incommensurate, any $t$ is the limit of numbers of
the form $ma - nb$, and since periodic functions are uniformly continuous, it follows that
$f^t$ is the uniform limit of functions of the form $f^{ma - nb}$, so $(f, g) = (f^t, g)$ for all $t$. Then
$a(f, g) = \int_0^a (f^t, g)\, dt = (\int_0^a f^t\, dt, g) = (a(f, 1), g) = a(f, 1)(1, g)$. Cancelling the factor $a$
gives the result we need.


Editorial comment. An account of the properties of the inner product used in Solution II
can be found in texts on harmonic analysis (e.g., see section VI.5 of Y. Katznelson, An
Introduction to Harmonic Analysis, Dover, 1976). The example $f_1(x) = \sin(x/3) - \sin(x/5)$,
$f_2(x) = \sin(x/5) - \sin(x/7)$, and $f_3(x) = \sin(x/7) - \sin(x/3)$ shows that the common period
($210\pi$ in this example) need not be the minimal period of any of the $f_i$.
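
The example in the editorial comment can be checked numerically. The sketch below samples the three functions and tests candidate periods to within a floating-point tolerance; the sample points and tolerance are arbitrary illustrative choices, so this is a sanity check rather than a proof.

```python
import math


def f1(x):
    return math.sin(x / 3) - math.sin(x / 5)


def f2(x):
    return math.sin(x / 5) - math.sin(x / 7)


def f3(x):
    return math.sin(x / 7) - math.sin(x / 3)


def looks_periodic(f, p, samples=200, tol=1e-9):
    """True if f(x + p) agrees with f(x) at a spread of sample points."""
    points = [0.37 + 1.7 * k for k in range(samples)]
    return all(abs(f(x + p) - f(x)) <= tol for x in points)


common = 210 * math.pi
print([looks_periodic(f, common) for f in (f1, f2, f3)])
# Each function also has a smaller period of its own:
print(looks_periodic(f1, 30 * math.pi),
      looks_periodic(f2, 70 * math.pi),
      looks_periodic(f3, 42 * math.pi))
```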


Solved also by D. Beckwith, P. Budney, R. J. Chapman (U. K.), M. Golomb, L. E. Mattics, E. Wolf, and the proposer.


Orthogonally Additive Functions on an Integer Lattice


10381 [1994, 363]. Proposed by Marcin E. Kuczma, University of Warsaw, Warszawa,
Poland. Determine all real valued functions $f$ on the integer lattice $\mathbb{Z}^2$ such that $f(u + v) =
f(u) + f(v)$ for every pair of orthogonal vectors $u, v$ in $\mathbb{Z}^2$.


Solution I by O. P. Lossers, University of Technology, Eindhoven, The Netherlands. It is
clear that the set of functions $f$ with this property forms a (real) vector space, $V$ say, and
that $f(0) = 0$. We show that a basis for this vector space is provided by the following
functions: $f_1(x, y) = x$, $f_2(x, y) = y$, $f_3(x, y) = x^2 + y^2$, and


$$f_4(x, y) = \begin{cases}
1 & \text{if $x$ is odd and $y$ is even;} \\
-1 & \text{if $x$ is even and $y$ is odd;} \\
0 & \text{if $x$ and $y$ have the same parity.}
\end{cases}$$


That these functions satisfy the condition is trivial for $f_1$ and $f_2$. For $f_3$, it follows from
the Pythagorean theorem. For $f_4$, one verifies directly that
$f_4(x_1, y_1) + f_4(x_2, y_2) = f_4(x_1 + x_2, y_1 + y_2)$
for all pairs $(x_1, y_1)$, $(x_2, y_2)$ except those where either $x_1$ and $x_2$ are
both odd, or $y_1$ and $y_2$ are both odd (but not both). Since the exceptional pairs can never be
orthogonal, it follows that $f_4 \in V$.

Now the functions $f_1$, $f_2$, $f_3$, and $f_4$ are clearly linearly independent, so it suffices to show
that $V$ is 4-dimensional. This follows from the fact that any $f \in V$ is completely determined
from its values on $(1, 0)$, $(-1, 0)$, $(0, 1)$, and $(0, -1)$. Indeed, we have
$f(x, y) = f(x, 0) + f(0, y)$, $f(2x, 0) = f(x, x) + f(x, -x)$, and $f(2x + 1, 0) = f(2x + 1, 1) - f(0, 1)$,
while $f(2x + 1, 1) = f(x + 1, x + 1) + f(x, -x)$. These equations (and similar ones for
$f(0, 2y)$, $f(0, 2y + 1)$ and $f(1, 2y + 1)$) show that, for any pair $(x, y)$ with $|x| + |y| > 1$,
$f(x, y)$ can be expressed as a function of “smaller” $(x, y)$’s.
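
The claim that the four basis functions are orthogonally additive can be spot-checked by brute force on a finite window of the lattice. The following sketch does this; the window size and helper names are illustrative, and a finite check is of course only a sanity test.

```python
from itertools import product


def f1(x, y):
    return x


def f2(x, y):
    return y


def f3(x, y):
    return x * x + y * y


def f4(x, y):
    """Parity function from Solution I (Python's % gives 0 or 1 for negatives too)."""
    if x % 2 == 1 and y % 2 == 0:
        return 1
    if x % 2 == 0 and y % 2 == 1:
        return -1
    return 0


def is_orthogonally_additive(f, bound=6):
    """Check f(u + v) == f(u) + f(v) for all orthogonal u, v with entries in [-bound, bound]."""
    points = list(product(range(-bound, bound + 1), repeat=2))
    for (x1, y1) in points:
        for (x2, y2) in points:
            if x1 * x2 + y1 * y2 == 0:
                if f(x1 + x2, y1 + y2) != f(x1, y1) + f(x2, y2):
                    return False
    return True


print([is_orthogonally_additive(f) for f in (f1, f2, f3, f4)])
```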


Solution II, generalizing to $n \ge 3$, by Robert Patenaude, College of the Canyons, Valencia,
CA. When the domain of $f$ is $\mathbb{Z}^n$ for $n \ge 3$, the condition $f(u + v) = f(u) + f(v)$
for orthogonal $u$ and $v$ becomes more restrictive. Let $e_1, \ldots, e_n$ be the unit coordinate
vectors in $\mathbb{Z}^n$ and let $f(e_k) = a_k$ and $f(-e_k) = b_k$ for $k = 1, \ldots, n$. Applying the
method of Solution I in the plane spanned by $e_1$ and $e_3$ gives $f(2e_3) = a_1 + b_1 + 2a_3$,
while use of the plane spanned by $e_2$ and $e_3$ gives $f(2e_3) = a_2 + b_2 + 2a_3$. Hence
$a_1 + b_1 = a_2 + b_2$. By similar reasoning, $a_1 + b_1, \ldots, a_n + b_n$ all have a common value,
say $2m$. For each index $k$ and any integer $u_k$, then, $f(u_k e_k) = m u_k^2 - m u_k + a_k u_k$. In
general, $f(u) = f(\sum u_k e_k) = \sum f(u_k e_k) = m \sum (u_k^2 - u_k) + \sum a_k u_k$, or in terms of the
obvious column vectors, $f(u) = m(u - 1)^T u + a^T u$. Conversely, for all such functions $f$,
$f(u + v) - f(u) - f(v) = m(u^T v + v^T u) = 2m(u \cdot v) = 0$ for orthogonal $u$ and $v$, as
required.


Editorial comment. The only property of $\mathbb{R}$ required for the statement of the problem is that
it have an operation denoted $+$. Thus, Aaron Meyerowitz and Fred Richman considered
functions from $\mathbb{Z}^n$ to an abelian group $A$. The space $F_n(A)$ of allowable functions then
forms an abelian group under pointwise addition. The case of $A = \mathbb{Z}$ is paramount. The
selected solutions are easily modified to obtain functions generating $F_n(\mathbb{Z})$ as a free abelian
group. For general $A$, $F_n(A)$ consists of combinations of these functions with coefficients
in $A$.

Similar spaces of functions defined on a vector space with an orthogonality relation have
been studied. Also, if $A$ allows multiplication by 1/2, $F_n(A)$ will be the direct sum of
an even part (functions satisfying $f(-u) = f(u)$) and an odd part (functions satisfying
$f(-u) = -f(u)$). See J. Rätz, On orthogonally additive mappings, II, Publ. Math. Debrecen
35 (1988), 241-249 and J. Rätz and Gy. Szabó, On orthogonally additive mappings,
IV, Aequationes Math. 38 (1989), 73-85 for related results from that theory.


Solved also by D. Callan, R. J. Chapman (U. K.), M. Golomb, R. Holzsager, N. Komanda, J. H. Lindsey II, S. C. Locke, 
A. D. Meyerowitz & F. Richman, N. Passell, J. Rätz (Switzerland), M. Reid, R. Richberg (Germany), R. M. Robinson,
K. Schilling, J. Spencer, T. White, A. N. ’t Woord (The Netherlands), and the proposer. 


A Matrix of Tangents 


10387* [1994, 474]. Proposed by Stanley Rabinowitz, Westford, MA and Peter J. Costa,
University of St. Thomas, St. Paul, MN. Let $T_n = (t_{i,j})$ be the n-by-n matrix with
$t_{i,j} = \tan(i + j - 1)x$, i.e.,


$$T_n = \begin{pmatrix}
\tan x & \tan 2x & \tan 3x & \cdots & \tan nx \\
\tan 2x & \tan 3x & \tan 4x & \cdots & \tan(n+1)x \\
\tan 3x & \tan 4x & \tan 5x & \cdots & \tan(n+2)x \\
\vdots & \vdots & \vdots & & \vdots \\
\tan nx & \tan(n+1)x & \tan(n+2)x & \cdots & \tan(2n-1)x
\end{pmatrix}.$$


Computer experiments suggest that $\det(T_n)$ equals


$$(-1)^{\lfloor n/2 \rfloor} \sec^n nx \prod_{r=1}^{n-1} \bigl( \sin^2(n-r)x \, \sec rx \, \sec(2n-r)x \bigr)^{r} \times
\begin{cases} \sin n^2 x & \text{if $n$ odd,} \\ \cos n^2 x & \text{if $n$ even.} \end{cases}$$


Prove or disprove this conjecture.
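
The kind of computer experiment mentioned in the problem is easy to repeat. The sketch below compares a floating-point determinant of $T_n$ with the conjectured product for small $n$; the elimination routine, the sample angle x = 0.2 and the function names are illustrative assumptions, and agreement to rounding error is evidence only, not a proof.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def tangent_det(n, x):
    """det(T_n) with entries tan((i + j - 1) x), i, j = 1..n."""
    return det([[math.tan((i + j - 1) * x) for j in range(1, n + 1)]
                for i in range(1, n + 1)])


def conjectured(n, x):
    """The conjectured closed form for det(T_n)."""
    def sec(t):
        return 1.0 / math.cos(t)

    value = (-1) ** (n // 2) * sec(n * x) ** n
    for r in range(1, n):
        value *= (math.sin((n - r) * x) ** 2 * sec(r * x) * sec((2 * n - r) * x)) ** r
    value *= math.sin(n * n * x) if n % 2 == 1 else math.cos(n * n * x)
    return value


for n in range(1, 5):
    print(n, tangent_det(n, 0.2), conjectured(n, 0.2))
```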


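Solution by David P. Robbins, Center for Communications Research, Princeton, NJ. We
evaluate a more general n-by-n determinant:


$$\det\left( \frac{a x_j + b y_k}{x_j + y_k} \right) =
\frac{(a - b)^{n-1} \bigl( a \prod x_j + (-1)^{n-1} b \prod y_k \bigr) \prod_{1 \le j < k \le n} (x_j - x_k)(y_j - y_k)}{\prod_{j,k=1}^{n} (x_j + y_k)}. \qquad (1)$$


Setting $a = -i$, $b = i$, $x_j = e^{i(2j-1)\theta}$, and $y_j = e^{-i(2j-1)\theta}$ in (1) gives the desired formula.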
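
Identity (1) can be tested with exact rational arithmetic before reading the proof. The sketch below evaluates both sides for integer inputs, using the Leibniz expansion for the determinant (adequate for small n); the function names and sample values are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations


def det(matrix):
    """Exact determinant via the Leibniz expansion (small matrices only)."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = Fraction(-1 if inversions % 2 else 1)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def left_side(xs, ys, a, b):
    """det of the matrix with entries (a*x_j + b*y_k) / (x_j + y_k)."""
    return det([[Fraction(a * x + b * y, x + y) for y in ys] for x in xs])


def right_side(xs, ys, a, b):
    """The closed form on the right of (1)."""
    n = len(xs)
    prod_x = prod_y = 1
    for x, y in zip(xs, ys):
        prod_x *= x
        prod_y *= y
    differences = 1
    for j in range(n):
        for k in range(j + 1, n):
            differences *= (xs[j] - xs[k]) * (ys[j] - ys[k])
    denominator = 1
    for x in xs:
        for y in ys:
            denominator *= x + y
    numerator = (a - b) ** (n - 1) * (a * prod_x + (-1) ** (n - 1) * b * prod_y) * differences
    return Fraction(numerator, denominator)


xs, ys = (1, 2, 4), (3, 5, 9)
print(left_side(xs, ys, 2, -3), right_side(xs, ys, 2, -3))
```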
To prove (1), note that the determinant has the form $P / \prod_{j,k} (x_j + y_k)$, where $P$ is a
polynomial that is homogeneous of degree $n^2$ in the $x$’s and $y$’s and homogeneous of degree
$n$ in $a$ and $b$. Because the determinant vanishes when two of the $x$’s or two of the $y$’s are
equal, $P$ is divisible by $\prod_{j<k} (x_j - x_k) \prod_{j<k} (y_j - y_k)$. If we set $a = b$, then all the entries become
identical. Thus subtracting the first row from each of the other rows makes each of the other
rows divisible by $(a - b)$, and $P$ is divisible by $(a - b)^{n-1}$.

Degree considerations show that the remaining factor has the form $au + bv$, where $u$ and
$v$ are homogeneous of degree $n$ in the $x$’s and $y$’s. Setting $x_1 = 0$ makes the first row of
the determinant (and hence $P$) divisible by $b$; thus $x_1$ divides $u$, and $u$ is a constant multiple
of $\prod x_j$. Similarly, $v$ is a constant multiple of $\prod y_j$. Finally, since the determinant must
remain the same if we exchange $a$ with $b$ and the $x$’s with the $y$’s, we may conclude that
$v / \prod y_j = (-1)^{n-1} u / \prod x_j$.

Thus the left side of (1) is a constant multiple $\lambda_n$ of the right side. To find $\lambda_n$, we take
the special case $b = 0$. Then we may cancel $a^n x_1 \cdots x_n$ from both sides, obtaining


$$\det\left( \frac{1}{x_j + y_k} \right) = \lambda_n \, \frac{\prod_{1 \le j < k \le n} (x_j - x_k)(y_j - y_k)}{\prod_{j,k=1}^{n} (x_j + y_k)}. \qquad (2)$$


If we multiply both sides of (2) by $x_1$ and take the limits as $x_1 \to \infty$ and then $y_1 \to \infty$,
we find that $\lambda_n = \lambda_{n-1}$. It is clear that $\lambda_1 = 1$, and the proof is complete.

The identity (2) (with $\lambda_n = 1$) appears in Thomas Muir, A Treatise on the Theory of
Determinants, Dover, 1960, paragraph 353, page 348.


Editorial comment. H. van Haeringen proved another generalization, computing the
determinant $D_n(a) = \det\bigl( \tan(a + i + j - 1)x \bigr)_{i,j=1}^{n}$ where $a$ is any complex number. He
showed that $D_{n-1}(a+1) D_{n+1}(a-1) = D_n(a+1) D_n(a-1) - (D_n(a))^2$ and also observed
that $H_n(a) = \det\bigl( (a + i + j - 1)^{-1} \bigr)_{i,j=1}^{n}$ satisfies this same recurrence.


Solved also by D. Callan, R. J. Chapman (U. K.), E. Fernandez Moral (Spain), H. van Haeringen (The Netherlands), and 
D. J. Wright. 


A Sure-Win Game of Solitaire 


10390 [1994, 574]. Proposed by Ognian Enchev, Boston University, Boston, MA. A standard
deck of 52 playing cards is arranged at random in 4 rows and 13 columns. Show that with 
finitely many transpositions of cards of the same value (e.g., 7♠ and 7♥, K♦ and K♠, and
so on) all cards can be arranged in such a way that each column contains one club, one 
diamond, one heart, and one spade. 


Solution I by John H. Smith, Boston College, Chestnut Hill, MA. Suppose that all suits occur 
in each of the first k — 1 columns, but that some suit y is missing from the kth column. By 
using at most k transpositions, we introduce suit y to the kth column without eliminating 
any suit from any of the first k columns.

Some suit x occurs at least twice in the kth column. Define a directed graph with 13 
edges whose vertices are the columns, as follows. For each value i, draw an arrow from the
column where value i occurs in suit x to the column where i occurs in suit y. Each of the 
first k — 1 columns has exactly one head and one tail, and column k has at least two tails 
and no head. 

Since each of the first k — 1 columns has one head and one tail, a path that enters this 
set must leave it. This yields a path from column k that eventually reaches a column later 
than k (it may go there directly). Making the transpositions corresponding to edges on this 
path introduces suit y to column k as desired. Applying this argument at most three times 
brings all suits into column k. The last column comes for free. Thus we can solve the whole 
problem in at most $3(1 + 2 + \cdots + 11) = 198$ steps.
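
The procedure of Solution I is constructive, so it can be run directly. The sketch below is one possible reading of it: for each column in turn, while a suit y is missing, it follows the arrows "column of (i, x) to column of (i, y)" from the current column until a later column is reached, then performs the corresponding same-value transpositions. The card encoding, the seeded dealing and the function names are illustrative assumptions, not part of the published solution.

```python
import random

SUITS = "CDHS"


def deal(seed):
    """Deal a shuffled 52-card deck into 13 columns of 4 cards (value, suit)."""
    deck = [(value, suit) for value in range(13) for suit in SUITS]
    random.Random(seed).shuffle(deck)
    return [deck[4 * c: 4 * c + 4] for c in range(13)]


def balance(columns):
    """Rearrange in place using same-value transpositions; return the swaps made."""
    swaps = []
    for k in range(len(columns)):
        while True:
            suits_here = [suit for _, suit in columns[k]]
            missing = [s for s in SUITS if s not in suits_here]
            if not missing:
                break
            y = missing[0]
            # Four cards but at most three suits: some suit x occurs twice.
            x = next(s for s in SUITS if suits_here.count(s) >= 2)
            position = {card: (c, r)
                        for c, column in enumerate(columns)
                        for r, card in enumerate(column)}
            # Follow arrows from column k through earlier (already complete)
            # columns until a column later than k is reached.
            path, col = [], k
            while col <= k:
                value = next(v for v, s in columns[col] if s == x)
                path.append(value)
                col = position[(value, y)][0]
            # Swap the x-card and y-card of each value on the path.
            for value in path:
                (c1, r1), (c2, r2) = position[(value, x)], position[(value, y)]
                columns[c1][r1], columns[c2][r2] = columns[c2][r2], columns[c1][r1]
                swaps.append((value, x, y))
    return swaps


columns = deal(seed=1)
moves = balance(columns)
print(len(moves), "transpositions")
print(all(sorted(s for _, s in col) == sorted(SUITS) for col in columns))
```

Each transposition exchanges two cards of equal value, so the multiset of values in every column is unchanged; only the suits move.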


Solution II by Gerry Myerson, Macquarie University, Sydney, NSW, Australia. Let $C_j$ be the
set of values $i$ such that a card of value $i$ appears in column $j$. Since each set of $k$ columns
contains $4k$ cards, it contains cards of at least $k$ values. Hence $\{C_1, \ldots, C_{13}\}$ satisfies Hall’s
condition and has a system of distinct representatives, meaning that we can select cards of
distinct values from the 13 columns. These cards have at least four cards in some suit; and
using at most 9 transpositions, we can spread that suit over the 13 columns. Repeating the
argument, we can correct a second suit using at most $13 - \lceil 13/3 \rceil = 8$ transpositions and
a third suit using at most $13 - \lceil 13/2 \rceil = 6$ transpositions. The remaining suit is then also
spread over the 13 columns, and we have solved the problem using at most 23 transpositions.


Editorial comment. If the initial configuration consists of 12 columns that have only one 
suit and a 13th column having all suits, then each of the 12 bad columns must be involved 
in at least 3 transpositions, which establishes a lower bound of 18 transpositions. 

Hall’s theorem and related results can be found in many books on combinatorics. In 
particular, it is Theorem 1.1 in chapter 5 of H. J. Ryser, Combinatorial Mathematics, Carus 
Mathematical Monographs, no. 14, MAA, 1963. Readers also were able to obtain concise 
proofs of the result by using more specific theorems discussed later in that chapter. 

It is natural to generalize to $m$ values and $n$ suits. The method of Solution II solves the 
game in at most $mn - \sum_{k=1}^{n} \lceil m/k \rceil$ steps, and the example described at the beginning of 
these comments leads to a lower bound of $\lfloor m/n \rfloor\, n(n-1)/2$ steps. 
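
The two general bounds are easy to evaluate numerically. The following is a minimal sketch (the function names `upper_bound` and `lower_bound` are ours, not the editors'), which for the standard deck with m = 13 and n = 4 reproduces the values 23 and 18 quoted above.

```python
def upper_bound(m: int, n: int) -> int:
    """mn - sum_{k=1}^{n} ceil(m/k), the bound obtained from the method of Solution II."""
    # -(-m // k) is an integer-only way to compute ceil(m / k).
    return m * n - sum(-(-m // k) for k in range(1, n + 1))


def lower_bound(m: int, n: int) -> int:
    """floor(m/n) * n(n-1)/2, the bound from the editorial example."""
    return (m // n) * n * (n - 1) // 2


if __name__ == "__main__":
    # Standard deck: 13 values, 4 suits.
    print(upper_bound(13, 4), lower_bound(13, 4))  # prints "23 18"
```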


Solved also by G. Anderson & C. Anderson, D. Beckwith, R. J. Chapman (U. K.), P. Griffin, H. Helfgott, R. Holzsager, 
N. Komanda, J. H. Lindsey II, J. H. van Lint (The Netherlands), S. C. Locke, O. P. Lossers (The Netherlands), K. D. 
McLenithan & K. D. McLenithan, D. K. Nester, R. E. Prather, R. M. Robinson, N. Shazeer, T. Tran, C. Vanden Eynden, 
T. White, A. N. ’t Woord (The Netherlands), NSA Problems Group, Prague Problems Group (Czech Republic), WMC 
Problems Group, and the proposer. 


Asymptotic Solution of a Recurrence Relation 


10403 [1994, 792]. Proposed by David Doster, Choate Rosemary Hall, Wallingford, CT. 
Define a sequence $(y_n)$ recursively by $y_0 = 1$, $y_1 = 3$, and 

\[ y_{n+1} = (2n+3)\,y_n - 2n\,y_{n-1} + 8n \]

for $n \ge 1$. Find an asymptotic formula for $y_n$. 


Solution by University of South Alabama Problem Group, Mobile, AL. Let $x_n = y_n + 2n + 1$. 
Then $x_0 = 2$, $x_1 = 6$, and 

\[ x_{n+1} - 2(n+1)x_n = x_n - 2n\,x_{n-1} \]

for $n \ge 1$. Hence $x_n - 2n\,x_{n-1} = 2$ for $n \ge 1$. Now let $z_n = x_n/(2^n n!)$. Then $z_0 = 2$ and 
$z_k - z_{k-1} = 2/(2^k k!)$ for $k \ge 1$. Summing this from $k = 1$ to $k = n$, we see that 

\[ z_n = 2 \sum_{k=0}^{n} \frac{1}{2^k k!}. \]

Hence 

\[ y_n = 2^{n+1} n!\,\sqrt{e} - 2n - 1 - \frac{1}{n+1} - \frac{1}{2(n+1)(n+2)} - \cdots. \]
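
As a sanity check on the substitution, the recurrence can be compared with the exact expression $y_n = 2^{n+1} n! \sum_{k=0}^{n} 1/(2^k k!) - 2n - 1$ that the solution implies. This is a sketch in exact rational arithmetic; the helper names are illustrative only.

```python
from fractions import Fraction
from math import e, factorial, sqrt


def y_by_recurrence(n_max: int) -> list:
    """Return [y_0, ..., y_{n_max}] (n_max >= 1) from the defining recurrence."""
    y = [1, 3]
    for n in range(1, n_max):
        y.append((2 * n + 3) * y[n] - 2 * n * y[n - 1] + 8 * n)
    return y


def y_closed(n: int) -> Fraction:
    """Exact value 2^(n+1) n! * sum_{k<=n} 1/(2^k k!) - 2n - 1 from the solution."""
    partial = sum(Fraction(1, 2**k * factorial(k)) for k in range(n + 1))
    return 2 ** (n + 1) * factorial(n) * partial - 2 * n - 1


def leading_term(n: int) -> float:
    """The asymptotic leading term 2^(n+1) n! sqrt(e)."""
    return 2 ** (n + 1) * factorial(n) * sqrt(e)


if __name__ == "__main__":
    ys = y_by_recurrence(12)
    for n, y in enumerate(ys):
        # The ratio to the leading term tends to 1 as n grows.
        print(n, y, y == y_closed(n), y / leading_term(n))
```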


Solved also by R. A. Agnew, E. S. Andersen & M. E. Larsen (Denmark), J. Anglesio (France), R. Bagby, R. Barbara 
(France), J. M. L. Besora (Spain), P. Bracken (Canada), A. Brown (Australia), M. Burger (Austria), N. P. Bhatia & 
W. O. Egerland, L. Cagliero & J. Lauret (Argentina), R. J. Chapman (U. K.), B. W. Conolly (U. K.), D. A. Darling, 
J. Davis, P. K. Desikan, R. S. Gautam (India), C. Georghiou (Greece), P. Griffin, J.-P. Grivaux (France), R. Holzsager, 
J. Howard, M. E. H. Ismail, P. G. Kirmser, B. G. Klein, K.-W. Lau (Hong Kong), G. N. Lewis, O. P. Lossers (The 
Netherlands), C. Mallinger (Austria), B. Margolis (France), H. Morris, I. Nemes (Austria), V. Novakov (Bulgaria), 
J. Ottenstein (Israel), M. A. Pinsky, H. Prodinger (Austria), M. Reid, N. C. Singer, A. Stenger, R. Stong, A. A. Tarabay 
(Lebanon), D. B. Tyler, J. H. van Lint (The Netherlands), M. Vowe (Switzerland), D. Zeitlin, NSA Problems Group, 
Prague Problems Group (Czech Republic), WMC Problems Group, and the proposer. 
A Nonlinear Recurrence in Two Variables 


10408 [1994, 793]. Proposed by Peter W. Shor, AT&T Bell Labs, Murray Hill, NJ. Suppose 
that a function $f$ is defined as follows: $f(1,1) = 1$, $f(i,0) = 0$ for $i > 0$, $f(1,j) = 0$ for 
$j \ge 2$, and 

\[ f(i,j) = 3(2i-j-2)\,f(i-1,j) + (i-2j+3)\,f(i-1,j-1) \]

otherwise. 
(a) Show that $\sum_{j \ge 1} (-1)^{j+1} f(i,j) = i(i+2)(i+4)\cdots(3i-4)$. 
(b) Find a closed form expression for $f(2j-1, j)$. 


Solution to (b) by George E. Andrews, Penn State University, University Park, PA. With the 
standard convention that $1/n! = 0$ if $n$ is a negative integer, we show that 

\[ f(i,j) = \begin{cases} 0 & \text{if } i < 2j-2, \\[4pt] \dfrac{3^{i+1-2j}\,2^{j-i}\,(i+j-1)\,(2i-j-1)!}{(i-2j+2)!\,(j-1)!} & \text{otherwise.} \end{cases} \]

The three boundary conditions are immediate. As for the recurrence, 

\begin{align*}
3(2i-j-2)\,f(i-1,j) &+ (i-2j+3)\,f(i-1,j-1) \\
 &= \frac{3^{i+1-2j}\,2^{j-i}\,(2i-j-2)!}{(i-2j+2)!\,(j-1)!}\,\bigl(2(i+j-2)(i-2j+2) + 3(i+j-3)(j-1)\bigr) \\
 &= \frac{3^{i+1-2j}\,2^{j-i}\,(2i-j-2)!\,(2i-j-1)(i+j-1)}{(i-2j+2)!\,(j-1)!} \\
 &= f(i,j).
\end{align*}

Thus the answer to (b) is 

\[ f(2j-1,j) = \frac{2^{1-j}\,(3j-2)\,(3j-3)!}{1!\,(j-1)!} = \frac{(3j-2)!}{2^{j-1}\,(j-1)!}. \]

Solution to (a) by National Security Agency Problems Group, Fort Meade, MD. Define 
$g(i,j) = (-1)^{j+1} f(i,j)$. We start by proving the recurrence 

\[ 3(9i^2-4)\,g(i,j) + 27(2i+1)\bigl(g(i+1,j+2) - g(i+1,j+1)\bigr) + 8g(i+2,j+1) - 9g(i+2,j+2) = 0. \]

Using the formula for $f(i,j)$ from part (b), we compute 

\[ 8g(i+2,j+1) - 9g(i+2,j+2) = \frac{(-1)^j\,(2i-j+1)!\;3^{i-2j+1}}{(i-2j+2)!\,(j+1)!\;2^{i-j}}\,A(i,j), \]

where 

\[ A(i,j) = 4(2i-j+2)(i+j+2)(j+1) + (i+j+3)(i-2j+2)(i-2j+1). \]

Also 

\[ g(i+1,j+2) - g(i+1,j+1) = \frac{(-1)^{j+1}\,(2i-j-1)!\;3^{i-2j-2}}{(i-2j+1)!\,(j+1)!\;2^{i-j}}\,B(i,j), \]

where 

\[ B(i,j) = 2(i-2j+1)(i-2j)(i+j+2) + 9(2i-j)(j+1)(i+j+1). \]

Thus 

\begin{align*}
27(2i+1)&\bigl(g(i+1,j+2) - g(i+1,j+1)\bigr) + 8g(i+2,j+1) - 9g(i+2,j+2) \\
 &= \frac{(-1)^j\,(2i-j-1)!\;3^{i-2j+1}}{(i-2j+2)!\,(j+1)!\;2^{i-j}}\,\bigl((2i-j+1)(2i-j)\,A(i,j) - B(i,j)(2i+1)(i-2j+2)\bigr),
\end{align*}

which reduces to 

\[ \frac{(-1)^j\,(2i-j-1)!\;3^{i-2j+1}}{(i-2j+2)!\,(j+1)!\;2^{i-j}}\cdot 3(3i-2)(3i+2)\,j(j+1)(i+j-1) = -3(9i^2-4)\,g(i,j) \]

by polynomial arithmetic. 

We have shown that the recurrence holds within the region where $f(i,j)$ is nonzero. 
Using the formula for $f(i,j)$, it is straightforward to show that the recurrence also holds at 
both boundaries (provided that we extend $f$ and $g$ by defining $f(i,-1) = 0$). 

Now let $S(i) = \sum_{j \ge 1} g(i,j)$. Summing the recurrence for $-1 \le j \le \lfloor i/2 \rfloor + 1$ 
yields $3(9i^2-4)S(i) - S(i+2) = 0$, and thus $S(i+2) = 3(3i-2)(3i+2)S(i)$. The 
initial values $S(1) = f(1,1) = 1$ and $S(2) = f(2,1) - f(2,2) = 2$ agree with the result 
claimed. For $i > 2$, we have 

\[ S(i) = (3i-8)(3i-4)\cdot 3\cdot S(i-2) = (3i-4)(3i-6)(3i-8)\,\frac{S(i-2)}{i-2}, \]

and the result follows by induction. 
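
Both parts can be spot-checked by machine. The sketch below evaluates $f$ directly from the recurrence and boundary conditions and compares it with the closed form of part (b) and with the product $i(i+2)\cdots(3i-4)$ of part (a). The function names are our own, and the check covers only small indices; it is no substitute for the proofs.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial, prod


@lru_cache(maxsize=None)
def f_rec(i: int, j: int) -> int:
    """f(i, j) for i >= 1, j >= 0, from the recurrence and boundary conditions."""
    if (i, j) == (1, 1):
        return 1
    if j == 0 or i == 1:  # f(i, 0) = 0 for i > 0 and f(1, j) = 0 for j >= 2
        return 0
    return 3 * (2 * i - j - 2) * f_rec(i - 1, j) + (i - 2 * j + 3) * f_rec(i - 1, j - 1)


def f_closed(i: int, j: int) -> Fraction:
    """Closed form from the solution to (b); zero when i < 2j - 2."""
    if j < 1 or i - 2 * j + 2 < 0:
        return Fraction(0)
    numerator = Fraction(3) ** (i + 1 - 2 * j) * Fraction(2) ** (j - i)
    numerator *= (i + j - 1) * factorial(2 * i - j - 1)
    return numerator / (factorial(i - 2 * j + 2) * factorial(j - 1))


def alternating_sum(i: int) -> int:
    """Sum of (-1)^(j+1) f(i, j) over the range of j where f can be nonzero."""
    return sum((-1) ** (j + 1) * f_rec(i, j) for j in range(1, i // 2 + 2))


def product_formula(i: int) -> int:
    """i (i+2) (i+4) ... (3i-4); the empty product (i = 1) equals 1."""
    return prod(range(i, 3 * i - 3, 2))


if __name__ == "__main__":
    for i in range(1, 8):
        print(i, alternating_sum(i), product_formula(i))
```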


Both parts were solved by both solvers and by the proposer. 


Multiple Floors and Ceilings 


10414 [1994, 912]. Proposed by R. J. Simpson, Curtin University of Technology, Perth, 
Australia, and W. F. Smyth, McMaster University, Hamilton, Ontario, Canada. For a positive 
real number $x$, let $C(x) = \lceil x/\lceil \sqrt{x} \rceil \rceil + \lceil \sqrt{x} \rceil$ and, for $x \ge 1$, let 
$F(x) = \lfloor x/\lfloor \sqrt{x} \rfloor \rfloor + \lfloor \sqrt{x} \rfloor$. 
(a) Express $C(x)$ in a form that requires only one evaluation of a square root. 
(b) Express $F(x)$ in terms of $C(x)$. 
Solution to (a) by Allen Stenger, Gardena, CA. We prove that $C(x) = \lceil 2\sqrt{\lceil x \rceil} \rceil$ for $x > 0$. 
Let $n = \lceil \sqrt{x} \rceil$, which is fixed for $(n-1)^2 < x \le n^2$. In this range, $C(x) = \lceil x/n \rceil + n$ is 
integer-valued and nondecreasing, starting at $2n-1$ and ending at $2n$. The switch occurs 
when $2n - 1 = x/n + n$, at $x = n(n-1)$. Thus $C(x) = 2n-1$ for $(n-1)^2 < x \le n(n-1)$ 
and $C(x) = 2n$ for $n(n-1) < x \le n^2$. 

Like $C(x)$, the formula $\lceil 2\sqrt{\lceil x \rceil} \rceil$ is integer-valued and nondecreasing for 
$(n-1)^2 < x \le n^2$, and it can change values only at integers. At $n^2 - n$, it equals 
$\lceil 2\sqrt{n^2 - n} \rceil = \lceil \sqrt{(2n-1)^2 - 1} \rceil = 2n - 1$. At $n^2 - n + 1/2$, it equals 
$\lceil \sqrt{(2n-1)^2 + 3} \rceil = 2n$. Thus the formula has the same value as $C(x)$ everywhere. 


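Solution to (b) by Robert A. Agnew, FMC Corporation, Chicago, IL. Suppose $x \ge 1$. Let 
$m = \lfloor \sqrt{x} \rfloor$, which is fixed for $m^2 \le x < (m+1)^2$. In this range (arguing as above), 

\[ C(x) = \begin{cases} 2m & \text{if } x = m^2, \\ 2m+1 & \text{if } m^2 < x \le m(m+1), \\ 2m+2 & \text{if } m(m+1) < x < (m+1)^2; \end{cases} \qquad F(x) = \begin{cases} 2m & \text{if } m^2 \le x < m(m+1), \\ 2m+1 & \text{if } m(m+1) \le x < m(m+2), \\ 2m+2 & \text{if } m(m+2) \le x < (m+1)^2. \end{cases} \]

It follows that $F(x) = C(x) - 1$ unless $x \in \{m^2, m^2+m\}$ or $(m+1)^2 - 1 \le x < (m+1)^2$, 
in which case $F(x) = C(x)$. 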


Editorial comment. Robert A. Agnew provided a simple formula for $F(x)$ analogous to that 
for $C(x)$: $F(x) = \lfloor 2\sqrt{\lfloor x \rfloor + 1} \rfloor$. Other correct formulas for $C(x)$ include $\lceil 2\sqrt{\lceil 2x \rceil/2} \rceil$ 
(O. P. Lossers), $\lceil \sqrt{4\lceil x \rceil - 1} \rceil$ (Albert Nijenhuis), and | 2([x2/[x]] + re) | (Western 
Maryland College Problems Group). 
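
The single-root formulas $C(x) = \lceil 2\sqrt{\lceil x \rceil} \rceil$ and $F(x) = \lfloor 2\sqrt{\lfloor x \rfloor + 1} \rfloor$ can be tested exactly on rational arguments. The sketch below is ours and makes one substitution that should be stated plainly: it replaces floating-point square roots with Python's integer square root `math.isqrt`, using $\lceil \sqrt{N} \rceil = \mathrm{isqrt}(N-1) + 1$ for a positive integer $N$, so that no rounding error can affect the comparison.

```python
from fractions import Fraction
from math import ceil, floor, isqrt


def ceil_sqrt(x) -> int:
    """ceil(sqrt(x)) for rational x > 0, via ceil(sqrt(x)) = ceil(sqrt(ceil(x)))."""
    return isqrt(ceil(x) - 1) + 1


def floor_sqrt(x) -> int:
    """floor(sqrt(x)) for rational x >= 0."""
    return isqrt(floor(x))


def C(x) -> int:
    """C(x) as defined in the problem (two square roots)."""
    n = ceil_sqrt(x)
    return ceil(x / n) + n


def F(x) -> int:
    """F(x) as defined in the problem, for x >= 1."""
    m = floor_sqrt(x)
    return floor(x / m) + m


def C_single_root(x) -> int:
    """ceil(2 sqrt(ceil(x))) = ceil(sqrt(4 ceil(x)))."""
    return isqrt(4 * ceil(x) - 1) + 1


def F_single_root(x) -> int:
    """floor(2 sqrt(floor(x) + 1)) = floor(sqrt(4 (floor(x) + 1)))."""
    return isqrt(4 * (floor(x) + 1))


if __name__ == "__main__":
    for t in range(8, 49, 5):
        x = Fraction(t, 4)
        print(x, C(x), C_single_root(x), F(x), F_single_root(x))
```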


Solved also by D. Callan, J. Christopher, R. B. Eggleton, O. P. Lossers (Netherlands), D. K. Nester, A. Nijenhuis, 
A. A. Tarabay (Lebanon), NSA Problems Group, WMC Problems Group, and the proposers. 


An Odd Integral Sum 


10473 [1995, 745]. Proposed by Emre Alkan (student), Bosphorus University, Istanbul, 
Turkey. Prove that there are infinitely many positive integers $m$ such that 

\[ \frac{1}{5 \cdot 2^m} \sum_{k=0}^{m} \binom{2m+1}{2k} 3^k \]

is an odd integer. 


Solution by Ulrich Abel, Fachhochschule Giessen-Friedberg, Friedberg, Germany. Let 
$a_m = 2^{-m} \sum_{k=0}^{m} \binom{2m+1}{2k} 3^k$. By the binomial formula, 

\[ a_m = \frac{(1+\sqrt{3})^{2m+1} + (1-\sqrt{3})^{2m+1}}{2^{m+1}} = \frac{(1+\sqrt{3})(2+\sqrt{3})^m + (1-\sqrt{3})(2-\sqrt{3})^m}{2}. \]

Thus the sequence satisfies a second order linear recurrence with characteristic roots $2 \pm \sqrt{3}$. 
This yields $a_m - 4a_{m-1} + a_{m-2} = 0$ for $m \ge 2$. Using $a_0 = 1$ and $a_1 = 5$, we obtain 
the repeating sequence 1, 5, 9, 1, 5, 9, ... for the remainders modulo 10 of $\{a_m\}$. Thus each 
$a_{3n+1}$ is an odd multiple of 5. 
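
The recurrence and the period-3 pattern modulo 10 are easy to confirm for small $m$. A minimal sketch follows, with $a_m$ computed directly from the binomial sum; the function name `a` simply mirrors the notation above.

```python
from math import comb


def a(m: int) -> int:
    """a_m = 2^(-m) * sum_{k=0}^{m} C(2m+1, 2k) 3^k, computed exactly."""
    total = sum(comb(2 * m + 1, 2 * k) * 3**k for k in range(m + 1))
    quotient, remainder = divmod(total, 2**m)
    if remainder:
        raise ArithmeticError("binomial sum is not divisible by 2^m")
    return quotient


if __name__ == "__main__":
    values = [a(m) for m in range(10)]
    print(values)                    # starts 1, 5, 19, 71, 265, ...
    print([v % 10 for v in values])  # 1, 5, 9 repeating
```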


Editorial comment. The Anchorage Solutions Group found the same sum in the College 
Mathematics Journal, Problem #428 [1990, 246; 1991, 257]. 


Solved also by E. S. Andersen & M. E. Larsen (Denmark), J. Anglesio (France), D. Beckwith, K. L. Bernstein, J. C. 
Binz (Switzerland, with generalization), G. Bouza Allende (Cuba), P. Bracken, S. Byrd, D. Callan, R. J. Chapman 
(U. K.), T. H. Crocker, Q. H. Darwish (Oman), J. S. Frame (with generalization), Z. Franco, S. M. Gagola Jr. (with 
generalization), C. Georghiou (Greece), N. H. Guersenzvaig (Argentina), J. K. Haugland (Norway, with generalization), 
Th. Honold (Germany), W. Janous (Austria, with generalization), M. S. Klamkin (Canada), W. Koepf (Germany), R. A. 
Kopas, Y. H. Kwong, F. Lengyel (with generalization), C. Libis (three solutions), J. H. Lindsey II, J. H. van Lint (The 
Netherlands), O. P. Lossers (The Netherlands), T. Mack, P. McCartney, A. Nijenhuis, P. Paule (Austria), A. Pedersen 
(Denmark), C. Popescu (Belgium, with generalization), O. A. Saleh & T. J. Walters (with generalization), L. Scribani 
(South Africa, with generalization), R. P. Sealy (Canada), J. Seibert (Czech Republic), H.-J. Seiffert (Germany, with 
generalization), L. W. Shapiro & P. Peart, A. Sinefakopoulos (Greece), J. H. Steelman, A. Stenger, I. Vardi, M. Vowe 
(Switzerland), A. N. ’t Woord (The Netherlands), Z. Wu, P. J. Zwier, Anchorage Solutions Group, GCHQ Problem 
Solving Group (U. K.), NSA Problems Group, Trinity University Problem Group, and the proposer. 


A Telescoping Sum 


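10494 [1996, 74]. Proposed by WMC Problems Group, Western Maryland College, Westminster, 
MD. For each positive integer $n$, evaluate the sum 

\[ \sum_{k=0}^{2n} (-1)^k \binom{4n}{2k} \Big/ \binom{2n}{k}. \]


Solution by Richard Holzsager, The American University, Washington, DC. The value is 
$-1/(2n-1)$. Let $\left[{n \atop k}\right] = \binom{2n}{2k} \big/ \binom{n}{k}$. For $1 \le k \le n$, we compute 

\begin{align*}
\left[{n \atop k}\right] + \left[{n \atop k-1}\right]
 &= \frac{\prod_{i=1}^{k}(2n-2i+1)}{\prod_{i=1}^{k}(2i-1)} + \frac{\prod_{i=1}^{k-1}(2n-2i+1)}{\prod_{i=1}^{k-1}(2i-1)} \\
 &= \frac{2n}{2k-1}\left[{n \atop k-1}\right] = \frac{2n}{2n+1}\left[{n+1 \atop k}\right].
\end{align*}

Thus 

\begin{align*}
\sum_{k=0}^{2n} (-1)^k \left[{2n \atop k}\right]
 &= 2 + \frac{4n-1}{4n-2} \sum_{k=1}^{2n-1} (-1)^k \left( \left[{2n-1 \atop k}\right] + \left[{2n-1 \atop k-1}\right] \right) \\
 &= 2 - 2\,\frac{4n-1}{4n-2} = -\frac{1}{2n-1}.
\end{align*}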
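
Both the two-term identity and the final value can be verified in exact arithmetic for small $n$. In this sketch the function `bracket(n, k)` stands for the bracket symbol defined in the solution; the name is ours.

```python
from fractions import Fraction
from math import comb


def bracket(n: int, k: int) -> Fraction:
    """The ratio C(2n, 2k) / C(n, k) used in the solution."""
    return Fraction(comb(2 * n, 2 * k), comb(n, k))


def alternating_sum(n: int) -> Fraction:
    """Sum over k = 0..2n of (-1)^k C(4n, 2k) / C(2n, k)."""
    return sum((-1) ** k * bracket(2 * n, k) for k in range(2 * n + 1))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, alternating_sum(n))  # -1, -1/3, -1/5, -1/7, -1/9
```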


Editorial comment. Solvers used a variety of methods, including induction, the beta integral, 
Gauss’s hypergeometric series summation, generating functions, and computer algebra. 
Several solvers noted that the identity appears as equation (4.26) on page 49 of H. W. 
Gould’s Combinatorial Identities (Morgantown, WV, 1972). 

Solved also by U. Abel (Germany), Z. Ahmed (India), E. S. Andersen & M. Larsen (Denmark), J. Anglesio (France), 
D. Beckwith, D. Bradley (Canada), J. T. Bruening, R. J. Chapman (U. K.), W. Chu & A. Marini (Italy), D. A. Darling, 
J.S. Frame, J. Grossman, M. Hoffman, A. Kaplan (France), C. Krattenthaler (Austria), J. H. Lindsey II, O. P. Lossers (The 
Netherlands), P. McCartney, R. Nelsen, J. H. Nieto, M. Petkovsek (Slovenia), H. Pollak, S. Radhakrishnan, O. A. Saleh 
& S. Byrd, L. Scribani (South Africa), H.-J. Seiffert (Germany), T. R. Shore, A. Sinefakopoulos (Greece), I. Sofair, 
J. H. Steelman, H. L. Stubbs, A. Tissier (France), T. V. Trif (Romania), R. B. Tucker, J. Van hamme (Belgium), M. Vowe 
(Switzerland), S. Wagon, H. S. Wilf, K. Williams (Canada), A. N. ’t Woord (The Netherlands), Z. Wu, NSA Problems 
Group, USA Problems Group, and the proposer. 


Boundedness Along Subsequences 


10500 [1996, 75]. Proposed by Jeffrey C. Lagarias and Peter W. Shor, AT&T Bell Laboratories, 
Murray Hill, NJ. Consider the following three properties that a sequence $\{f(n) : n = 1, 2, \dots\}$ 
of real numbers may have. 

(P1) The sequence $\{f(n) : n = 1, 2, \dots\}$ is bounded. 

(P2) For each real $\lambda > 1$, the subsequence $\{f(\lfloor 2^{\lambda^n} \rfloor) : n = 1, 2, \dots\}$ is bounded. 

(P3) For each real $\lambda > 1$, the subsequence $\{f(\lfloor \lambda^{2^n} \rfloor) : n = 1, 2, \dots\}$ is bounded. 
Obviously (P1) $\Rightarrow$ (P2) and (P1) $\Rightarrow$ (P3). What other implications hold, if any? 


Solution by Richard Holzsager, The American University, Washington, DC. Property (P2) 
is equivalent to (P1), but (P3) is strictly weaker. 

To see the first fact, we need to show that if a function $f$ is unbounded on the integers, then 
it is unbounded on the sequence $\lfloor 2^{\lambda^n} \rfloor$ for some $\lambda > 1$. Choose $1 < n_1 < n_2 < n_3 < \cdots$ 
so that $f(n_k)$ is strictly increasing to infinity. Let lg be the base 2 logarithm, and denote 
by $I_k$ the image under lg lg of the interval $[n_k, n_k + 1]$. Note that each $I_k$ has length 
less than 1. Starting with $J_1 = I_1$, inductively define a nested sequence $\{J_n\}$ as follows: 
Given $J_n = [a, b]$, let $m$ be an integer large enough so that $m(b - a) > a + 1$. Choose 
$k = k_{n+1}$ so that $I_k = [c, d]$ satisfies $c > ma$. Enlarge $m$, if necessary, to $m_{n+1}$ so that 
$m_{n+1} a < c < (m_{n+1} + 1)a$. Then $d < c + 1 < m_{n+1} a + a + 1 < m_{n+1} b$, so the interval 
$J_{n+1} = I_k/m_{n+1}$ is contained in the interior of $J_n$. The lengths of these intervals approach 
0, so their intersection consists of a single point $t$. Then, for every $n$, the multiple $mt$ is in 
the interior of $I_k$ for $k = k_n$, $m = m_n$. This is equivalent to $n_k < 2^{2^{mt}} < n_k + 1$. If we 
choose $\lambda = 2^t$, then $\lfloor 2^{\lambda^m} \rfloor = n_k$, and $f$ is unbounded on the sequence determined by $\lambda$. 


For the second claim, let $f(n) = n$ if $n = 2^{2^k} - 1$, and 0 otherwise. By adding or 
deleting a finite number of terms of any sequence $\{\lfloor \lambda^{2^n} \rfloor\}$, we can assume $\sqrt{2} \le \lambda < 2$. 
Then for all $k$, we have $2^{2^{k-1}} \le \lambda^{2^k} < 2^{2^k}$. Furthermore, the difference between the last 
two quantities clearly increases with $k$, and is eventually greater than 1. This means that 
$f(\lfloor \lambda^{2^k} \rfloor)$ is eventually 0 for any $\lambda$. 
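
The second claim can be watched in action for rational $\lambda$, where $\lfloor \lambda^{2^n} \rfloor$ is computable exactly with integers. The sketch below is purely illustrative: the helper names and the sample values $\lambda = 199/100$ and $\lambda = 3/2$ are our choices. For $\lambda = 199/100$ the function $f$ is nonzero at $n = 1, 2$ (where the floors are 3 and 15) and vanishes from $n = 3$ on, once $2^{2^n} - \lambda^{2^n}$ exceeds 1.

```python
def is_marked(n: int) -> bool:
    """True when n = 2^(2^k) - 1 for some integer k >= 0."""
    m = n + 1
    if m < 2 or m & (m - 1):      # m must be a power of two, at least 2
        return False
    exponent = m.bit_length() - 1
    return exponent & (exponent - 1) == 0  # the exponent must itself be a power of two


def f(n: int) -> int:
    """The counterexample function: f(n) = n on marked integers, 0 elsewhere."""
    return n if is_marked(n) else 0


def floor_power(p: int, q: int, n: int) -> int:
    """floor((p/q)^(2^n)) for positive integers p, q, computed exactly."""
    e = 2**n
    return p**e // q**e


if __name__ == "__main__":
    for p, q in [(199, 100), (3, 2)]:
        print(p, q, [f(floor_power(p, q, n)) for n in range(1, 9)])
```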


Solved also by J. Merickel, GCHQ Problem Solving Group (U. K.), and the proposers. 
REVIEWS 


Edited by Underwood Dudley 
Mathematics Department, DePauw University, Greencastle, IN 46135 


Table of Integrals, Series and Products: CD-ROM Version 1.0. By I. S. Gradshteyn and 
I. M. Ryzhik, edited by Alan Jeffrey. Academic Press, 1996, $79.95 


Reviewed by Jet Wimp 


I adore GRJ (Gradshteyn, Ryzhik, and Jeffrey), all 1200 pages of it. I have two 
copies: one for the office, one for home. If my upper body musculature were better 
developed, I would carry it to Europe with me, to Asia, to the Himalayas. Though 
many worthwhile identities are missing from GRJ, for instance those relating to 
differential and difference properties of hypergeometric functions, it can serve as a 
near equivalent to the combined five volumes in the Erdélyi set [1], [2]. It’s the 
ideal desert island book, though of course one has to get it there, and it may not be 
easy explaining to one’s ex-shipmates why this hefty and venerable tome should be 
given preference over that moldy goatskin flask of water when planning the 
contents of the lifeboat. 

Let me talk a bit about this volume, the book, not the CD I’m reviewing. 
Though called, modestly, Table of Integrals, Series, and Products, that doesn’t begin 
to do justice to its contents. The appellation “sums” covers closed-form expressions 
of frequently occurring sums along with much material on the convergence of 
infinite series. These topics, and products, functional series, asymptotic series, and 
formulas from differential calculus comprise the introductory chapter, the 0th 
chapter of the volume. Also materializing in this chapter are some of those hoary, 
cabalistic functions that used to inhabit our mathematical books but no longer do: 
the Gudermannian, Lobachevsky’s “angle of parallelism.” 

The first chapter is a treatment of elementary functions so comprehensive it will 
allow the owner to ditch all those tattered trig tables. The second, third, and fourth 
chapters deal with definite and indefinite integrals of elementary functions, and 
the fifth, sixth, and seventh deal with definite and indefinite integrals of special 
functions. Of course, to make the book self-contained, a discussion of special 
functions is required, and this the book has, in spades. Its eighth and ninth 
chapters contain a wonderful 200 page treatment of all the standard higher 
functions, and many of the results given there are absent from the Erdélyi volumes 
—indeed, absent from any mathematical treatment commonly available. The 
formulas are a lot of fun to read and one can, Jeopardy-like, shield from view the 
left hand side of an equation and ask what the right hand side represents, for 
instance: 
ANSWER: 


The series 

\[ \sum_{k=0}^{\infty} c_k z^k, \]

where 

\[ c_{n+1} = \frac{1}{n+1} \sum_{k=0}^{n} (-1)^{k+1} s_{k+1}\, c_{n-k}, \qquad n = 0, 1, 2, \dots, \]

with $c_0 = 1$, $s_1 = \gamma = .57721\ldots$, $s_n = \zeta(n)$, $n \ge 2$. 

QUESTION: What is $\Gamma(z + 1)$? 


There are several supplementary chapters that practicing mathematicians will 
consider pure gold: a chapter on vector field theory, one on algebraic inequalities, 
one on integral inequalities, a chapter on matrices, one on determinants, one on 
norms, one on ordinary differential equations, and, to cap it off, a chapter 
containing Mellin, Fourier, and Laplace transforms; this chapter is vestigial, 
though, and doesn’t offer a viable alternative to the standard tables [4], [5], [6]. 

The earlier editions had gobs of mistakes, but thanks to dedicated readers who 
have filed their suggested emendations with the editor, with the fifth edition most 
of the mistakes have been weeded out. I found one though (I almost had to justify 
my credentials for writing this review). The power β on the right hand side of the 
transformation formula for the Appell function F₁, formula 3 on page 1083, should 
be −β. 

Where, in its comprehensiveness, does this volume stand in comparison to other 
tables? Well, it far surpasses the quaint handwritten two-volume 1965 table of 
integrals of Gröbner and Hofreiter [3]. The transform sections, as pointed out, 
suffer in comparison to other tables. It certainly contains far less material than the 
mammoth five-volume set by Prudnikov and others with its 3500 pages of material 
[7]. The scope of [7], though, probably exceeds what anyone will ever require. The 
reader may know of the Borges short story, “The Library of Babel” in which the 
author envisions a library whose first room contains a hundred or so books, the 
first containing a single page with the single letter “A,” the second a page with the 
single letter “B,” etc. The second room of the library starts with a book containing 
a single page with the letters “AA,” followed by a book containing “AB,... .” I’m 
sure the reader gets the idea. Any desired text you want will be somewhere in that 
library; the problem is only one of information retrieval. The Prudnikov volumes 
come close to being a mathematical equivalent to Borges’ imaginary construct. 

And the presentation of the GRJ book: the binding, the appearance, the 
typography are all splendid. Remember the shabby photo offset reproductions of 
Russian books that held us in hostage twenty or so years ago—the malodorous 
books on oatmeal colored paper with the English intertext and the Cyrillic 
formulas? Nothing could be aesthetically more distant from these books than this. 
Academic Press has compiled a gorgeous volume. Naturally, I can think of things 
that should have been included that weren’t, but when I am searching for a 
formula vital to some research objective, it’s surprising how often it can be found in 
GRJ. 
So what more remains to be said? Well, there’s the old joke about the play 
heavily revised in tryouts in Philadelphia and Boston, and the producer saying to 
the playwright after a disastrous Broadway opening night, “You just died from 
improvement.” Could this CD sound the death knell of a wonderful publishing 
concept? 

On the drawing board, it must have seemed like a marvelous idea. A hugely 
popular enchiridion: let’s make it available to anyone at the flick of a computer 
key. Let’s make a CD out of it! Academic Press enlisted the talents of Lightbinders 
of San Francisco, who produced the CD using the opulent and flexible text display 
software called DynaText 2.3. I was impressed with the software, which might be a 
wonderful way to render some books computer accessible. But here, no. What 
went wrong? The very worst thing that could go wrong. You can’t see the formulas! 
They are tiny, tiny, tiny. The liner notes for the CD cautioned that the reader 
should have Adobe File Manager to properly view the formulas. I ordered it. It 
helped not at all. One can do a mouse click in the upper right hand corner and 
things enlarge a bit, but not enough. 

I noticed how Academic, perhaps reacting to an increasingly litigious society, 
had included in the flyleaf of the original book a warning: 


Academic Press and the editor have expended great effort to ensure that the material presented 
in this volume is correct... . However...neither Academic Press nor the editor shall be 
responsible for any errors, omissions, or damages arising from the use of this volume. 


If this warning was ever warranted, it was in the liner notes of this CD. Damages, 
indeed. I estimate the chances of retrieving a correct formula from the screen 
display at about one out of three, that is, for anyone lacking microfocal vision. Now 
I was doing all this on a Mac platform: maybe someone’s DOS or Windows 
equation was lushly readable, elegant, utile. If so, I offer my humblest apologies to 
Academic; my hearty congratulations go to any such customers. They may imagine 
themselves fortunate but, computer karma being what it is, they’ll soon enough be 
victims to some other piece of flawed software. 

I phoned customer support at Lightbinders, and there followed one of those 
Kafkaesque conversations that seem to be so much a feature of the computer age. 
“You can display the TeX representation of an equation,” the consultant pointed 
out. “Then you can paste the equation into a TeX document.” But copying by 
highlighting the selection didn’t work. “Hmmmm,” the consultant said. “Do a copy 
from the file menu,” he ordered. It didn’t work. ““HHMMMM,” he said. “Do a 
copy by depressing the command and C keys.” No luck. “HHHMMMMMMMM,” 
he said. “That’s strange.” There I was, facing this travesty of the desired equation, 
staring at me in its TeX metamorphosed form. Has the reader ever tried to 
reconstruct an equation from its TeX representation? Don’t. It’s easier to derive 
the equation. One of my favorite equations, a darling double integral for an Appell 
function, had become a sixteen-line porridge of “{}}}{}}(’s and “//\/\/\/’s. One 
can print the whole page containing the equation in question, but before one 
would do this, one would want to know what the equation says. The pericopes on 
notations and index of special functions look like so much flypaper. From the 
customer rep I got no satisfaction, and I was reminded of those primitive tribes 
where half the members speak one language, the other half another. 

What else could go wrong? Something else did. You can’t find things. The list 
of contents is too abbreviated, consisting only of the chapter headings in the book; 
further details require clicking the mouse on the displayed headings. Once again, a 
Catch-22. You have to know where something is to find it. Where should you look 
for the definition of Euler’s constant? In the introduction? In the definition of 
elementary functions? In the integral tables? No, far away in Chapter 9, Special 
Functions, buried several layers deep. The table of contents of the book, the 
honest to god paper book, has to be visually scanned too, but it is all in front of 
you: nothing is buried under something else. After struggling with the CD-ROM, I 
clutched the book to my chest, praying it would never go away. 

Also, the software may have corrupted my system file, entailing a trek to the 
computer services center four city blocks away and, after I discovered the com- 
puter hardware was intact, a trek back to reinstall the system software. However, I 
make no accusations. Those ethereal and subtle software incompatibilities may be 
the only true things of the spirit, the only mysteries, vouchsafed to a technocratic 
society barreling into the 21st century. Let us revere them. 

I love books, but I’m not a Luddite. I embrace software that works. Macbeth 
talked about being yanked untimely from his mother’s womb, and I think that 
software—market pressure, no doubt—is often yanked untimely from the de- 
signer’s noggin. If you’re planning on adding this CD to your library, beware. Be 
certain your computing platform accommodates it in a way that makes it legible, 
convenient, and system compatible. Take nothing for granted. A hasty purchase 
means a wracking day on the phone to some callow customer rep in some distant 
part of the country, and no concomitant satisfaction. 


REFERENCES 


1. Erdélyi, A., et al., Higher Transcendental Functions, 3 vols., McGraw-Hill, New York, 1953. 

2. Erdélyi, A., et al., Tables of Integral Transforms, 2 vols., McGraw-Hill, New York, 1954. 

3. Gröbner, W. and N. Hofreiter, Integraltafel: erster Teil, Unbestimmte Integrale; zweiter Teil, 
Bestimmte Integrale, Springer-Verlag, Wien, 1965. 

4. Oberhettinger, F., Tabellen zur Fourier Transformation, Springer-Verlag, Berlin, 1957. 

5. Oberhettinger, F., Tables of Bessel Transforms, Springer-Verlag, New York, 1972. 

6. Oberhettinger, F. and L. Badii, Tables of Laplace Transforms, Springer-Verlag, New York, 1973. 

7. Prudnikov, A. P., Yu. A. Brychkov, and O. I. Marichev, Integrals and Series, 5 vols., Gordon and 
Breach, New York, 1986. 


Department of Mathematics and Computer Science 
Drexel University 

Philadelphia, PA 19104 

jwimp@mcs.drexel.edu 


The Parsimonious Universe. By Stefan Hildebrandt and Anthony Tromba. Springer- 
Verlag New York, Inc., 1996, 344 pp., $32. 


Reviewed by Frank Morgan 


We are just beginning to understand how geometry rules the universe. We know 
that a round soap bubble has found the least-area way to enclose a given volume of 
air, but we do not know for sure whether the familiar double soap bubble provides 
the least-area way to enclose and separate two given volumes of air, despite the 
much-publicized recent computer breakthroughs on the special case of equal 
volumes by Hass, Hutchings, and Schlafly [3], [4], [10]. The 100-year-old 
Kelvin Conjecture on the least-area way to partition space into equal volumes 
was disproved in 1994 by the new, less symmetric conjectured optimal 
structure of Weaire and Phelan [18], [14], [11], [9], computed 
to be about 0.3% better by Brakke’s Evolver (see Brakke’s home page at 
http://www.susqu.edu/facstaff/b/brakke/default.htm). The analogous planar 
conjecture states that regular hexagons provide the least-perimeter way to parti- 
tion the plane into unit areas. Even this planar conjecture remains open today, 
despite abundant mistaken announcements and proofs, from H. Weyl’s Symmetry 
[19, p. 85] (originally published in 1952) to the book under review (p. 225). 

At the International Conference on Differential Geometry at IMPA in Rio de 
Janeiro, Brazil, July, 1996, Bessa and Jorge announced a proof of the Calabi 
Conjecture, that there are no complete bounded minimal surfaces in R^3. Meanwhile 
D. Hoffman and others continue to produce beautiful new examples of 
unbounded complete minimal surfaces; in Brazil, M. Traizet [20] explained how to 
get the latest by desingularizing families of intersecting vertical planes. 

The May, 1996 Euroconference on Foams in Arcachon, France brought to- 
gether mathematicians, scientists, engineers, and others to discuss applications of 
soap bubbles and foams to cell structures, wine, bread, cleansers, fertilizers, 
shaving cream, oil wells, mines, acoustics, drug delivery, fire extinguishers, nuclear 
safety, and concrete. 

Behind all the mathematics lie the human stories. Hass and Schlafly got their 
idea for a computer proof on the double bubble during a calm stretch between 
rapids while kayaking down the south fork of the American River in northern 
California. Their work depended on previous undergraduate research, as further 
developed by Hutchings, now a graduate student at Harvard. As for Lord Kelvin, 
we do not know exactly where he got his idea, but his niece reported [7, pp. 
46-47]: 


When I arrived here yesterday, Uncle William [Lord Kelvin] and Aunt Fanny met me at the 
door, Uncle William armed with a vessel of soap and glycerine prepared for blowing soap-bub- 
bles, on a tray with a number of mathematical figures made of wire. These he dips into the 
soap-mixture and a film forms or adheres to the wires very beautifully and perfectly regularly. 
With some scientific end in view he is studying these films. 


Weaire and Phelan found their better structure in nature as a certain “clathrate.” 
Brakke himself had searched long in vain, while all the time Linus Pauling’s text, 
The Nature of the Chemical Bond [13], with a picture on page 471 of the new 
clathrate structure, sat on a shelf above his head. 

The Parsimonious Universe by Hildebrandt and Tromba provides an exquisite 
liberal essay on the role of geometric minimization in the structure of the physical 
universe. Anyone familiar with the magnificent earlier version, published under 
the title Mathematics and Optimal Form (and reviewed by me in the Monthly [12]), 
will be amazed to find the new version a level still more attractive, with even more 
choice color illustrations, a still more appealing layout, and new material. Previ- 
ously inaccessible documents from the former East Germany include an early 
letter of Leibniz on the principle of least action and the Copernicus portrait at 
Leipzig. In a virtually unknown 1630 engraving, Queen Dido directs her attendants 
to cut an ox hide into thin strips and arrange them into a large circular enclosure 
of land, to take fullest advantage of the terms of her bargain. 
Hildebrandt recently mentioned to me the question they address of “whether 
Newton had originally written his Principia in the language of infinitesimal calculus 
or not. The editor of Newton’s mathematical papers, Derek Whiteside, thinks the 
first, Michael Nauenberg from [UC] Santa Cruz, a well-known theoretical 
physicist..., is a recent strong proponent of the second opinion, and we have 
learned that an exciting discussion is taking place and might increase in the next 
years.” 

There is an illustrated account of H. Wente’s 1984 discovery of a closed 
immersed constant-mean-curvature surface, although nothing of more recent ex- 
amples of N. Kapouleas and others. 

A chapter on Soap Films includes a nice account of Jean Taylor’s mathematical 
derivation [17] of Plateau’s laws [15]: that soap films meet only in threes at 120 
degree angles along curves and that such curves meet only in fours at about 109 
degree angles at points. There is an untold story here worth adding. The first step 
of the proof is to classify all nine nontrivial nets of geodesics on the sphere 
meeting in threes at 120 degree angles. The second step is to show that all but two 
of them do not produce stable soap film cones. Lamarle [8], a contemporary 
mentioned by Plateau, took up the project but missed one of the nine cases. 
Aladar Heppes, unaware of Lamarle’s work, produced a complete proof but 
published only the first step [5]. Taylor [17], initially unaware of Lamarle or 
Heppes, repeated the task, but ironically made a mistake on one of the nine cases. 
John Sullivan [16] explained a beautiful dual approach to the first step. Heppes 
finally wrote up the second step of his proof [6]. When writing the paper, he 
checked Mathematical Reviews, discovered Taylor’s paper, and thence learned of 
Lamarle’s work for the first time. In 1994, I asked an e-mail correspondent in 
Budapest of the same name, “You are not related to the Heppes who classified 
geodesic nets on the sphere, so central to the classification of soap film singulari- 
ties, are you?” The reply: “The answer for your question is yes, I am related, and 
the relation is identity.” He finally met Taylor, Sullivan, and me at a special session 
on Soap Bubble Geometry, which I organized at the 1995 Burlington Mathfest. 

Although Hildebrandt and Tromba do not mention even the names of many 
contemporary scientists (a gap fortunately filled by the series on Mathematical 
People by Albers, Alexanderson, and Reid [1], [2]), they give marvelous accounts of 
historical figures from Archimedes to Newton to Max Planck. Planck defended 
Leibniz’s philosophy that our world is the best among all possible worlds in the 
face of world tragedy. The book ends with a touching postscript on Planck’s own 
personal tragedies: 


Planck’s oldest son Karl was killed at Verdun in 1916; both daughters died when giving birth to 
their first child; the younger son Erwin, because he had been involved in von Stauffenberg’s 
assassination attempt of July 20, 1944, was sentenced to death by Hitler’s criminal judge Freisler 
and was executed in January 1945. Shortly thereafter, Planck lost all his personal belongings, and 
at almost eighty-seven years of age he found himself in the great trek of refugees to the West, 
just like millions before and after him. 
Only he who knows what mathematics is, and what its function in
our present civilization, can give sound advice for the improvement
of our mathematical teaching.


Hermann Weyl, Collected Works, Volume I (Opposite Weyl photograph)
Contributed by Hung-Hsi Wu, University of California at Berkeley
Mathematics Appreciation, S(13-14). Short
Stories From the History of Mathematics. 
Robert E. Knauff. Carolina Mathematics (Car- 
olina Biological Supply Co., 2700 York Road, 
Burlington, NC 27215), 1996, v + 99 pp, 
$10.50 (P). [ISBN 0-89278-435-0] 130 anec- 
dotes, many well-known (Indiana legislature 
and π, origins of L'Hôpital’s rule, incidents
from the lives of various mathematicians). DB 


Precalculus, T(13: 1). Precalculus: Enhanced 
with Graphing Utilities. Michael Sullivan, 
Michael Sullivan, III. Prentice Hall, 1996, xxii 
+ 1054 pp. [ISBN 0-02-343742-1] Tradi- 
tional content, but emphasis and pedagogical 
approach influenced by technology. Includes 
open-ended problems, writing exercises, and 
collaborative projects. 


Precalculus, T(13: 1). Contemporary Precal- 
culus: A Graphing Approach, Second Edition. 
Thomas W. Hungerford. Saunders College, 
1997, xviii + 871 pp, $45. [ISBN 0-03-018544- 
0] New edition reflects changes in available 
technology and user feedback. (First Edition, 
TR, April 1995.) 


Precalculus, T(13: 1). A Graphical Approach 
to Precalculus. E. John Hornsby, Jr., Margaret 
L. Lial. HarperCollins, 1996, xxvi + 981 pp, 
$62.50. [ISBN 0-673-99966-1] Graphing 
calculator technology introduced in first chap- 
ter and used throughout. Includes group discus- 
sion activities and exercises “that make students 
aware of the connections between topics ....” 


Education, P, L. Windows on Mathematical 
Meanings: Learning Cultures and Computers. 
Richard Noss, Celia Hoyles. Math. Educ. Lib., 
V. 17. Kluwer Academic, 1996, xi + 275 pp, 
$130. [ISBN 0-7923-4073-6] A comprehen- 
sive foundation for a revolution intended to 
make visible (and meaningful) the pervasive 
but ‘dead’ mathematics “of the few,” that is
designed “to facilitate the priorities of those 
who control it” by making its power “invisi- 
ble to the many.” Rooted in Seymour Papert’s 
vision of Logo as a children’s mathematics, this 
analysis offers computers as the salvation of 
mathematics—as a tool to make our mathemat- 
ical culture visible to the many. LAS 


Education, P, L. Assessment: Problems, De- 
velopments and Statistical Issues: A Volume 
of Expert Contributions. Eds: Harvey Gold- 
stein, Toby Lewis. Wiley, 1996, xvi + 273 pp, 
$74.95. [ISBN 0-471-95668-6] 15 chapters 
on all aspects of assessment: history, psycho- 
metric models, bias, comparison of institu- 
tions, educational testing, vocational assess- 
ment, international comparisons, and much 
more. Formula-free exposition broadens the 
potential audience to administrators and policy 
makers. LAS 


Combinatorics, T(16—17: 1), L. Introduction 
to Combinatorics. Martin J. Erickson. Ser. 
in Disc. Math. & Optim. Wiley, 1996, xii + 
195 pp, $59.95. [ISBN 0-471-15408-3] Main 
categories: existence (e.g., pigeonhole princi- 
ple, partial orders, Ramsey theory), enumer- 
ation (including tableaux, Polya theory), and 
construction (codes, designs). Open problems 
interspersed throughout text. LC 


Discrete Mathematics, T(15-16: 1), L. Elements
of Algebraic Coding Theory. L.R. Vermani.
Math. Ser. Chapman & Hall, 1996, viii +
254 pp, (P). [ISBN 0-412-57380-6] Primary 
focus is group/linear codes. Linear algebra pre- 
requisite; abstract algebra concepts introduced 
as needed. Mathematically rigorous, but acces- 
sible to undergraduates. LC 


Discrete Mathematics, T(13—14: 1). Discrete 
Mathematics, Fourth Edition. Richard John- 
sonbaugh. Prentice Hall, 1997, xiv + 701 pp. 
[ISBN 0-13-518242-5] New to this edition: 
sections on binary and hexadecimal numbering 
systems, RSA public-key cryptosystems, more 
examples and exercises. (Second Edition, TR, 
May 1990.) TH 


Number Theory, T(17-18: 1), S, P, L*. Al- 
gorithmic Number Theory, Volume I: Efficient 
Algorithms. Eric Bach, Jeffrey Shallit. Found. 
of Comp. MIT Pr, 1996, xvi + 512 pp, $55. 
[ISBN 0-262-02405-5] A thorough introduc- 
tion to computational number theory. Covers 
GCD, calculations over finite rings and fields, 
primality testing. Many exercises. Extensive 
notes and bibliography. DB 


Group Theory, P. Modules and Group Algebras.
Jon F. Carlson. Lect. in Math. Birkhäuser
Boston, 1996, xi + 91 pp, $26.50 (P). [ISBN
0-8176-5389-9]

Calculus, T(14: 1), S. Functions of Two Variables.
Seán Dineen. Chapman & Hall, 1995,
x + 189 pp, $23.95 (P). [ISBN 0-412-70760- 
8] A gentle introduction to multivariable cal- 
culus. Discusses maximum and minimum prob- 
lems, plane curves, and integration theory on 
R². Concise and informal exposition. AO


Real Analysis, S(17), P. Advanced Analysis
on the Real Line. R. Kannan, Carole King 
Krueger. Universitext. Springer-Verlag, 1996, 
ix + 259 pp, $45 (P). [ISBN 0-387-94642-X] 
Presents topics not usually covered in textbooks 
on measure theory (e.g., approximate conti- 
nuity, approximate differentiability, Hausdorff 
measure). Assumes knowledge of real analysis 
and some measure theory. PG 


Differential Equations, T(16-18: 1), L. Dif- 
ferential Equations and Dynamical Systems, 
Second Edition. Lawrence Perko. Texts in 
Appl. Math., V. 7. Springer-Verlag, 1996, xiv + 
519 pp, $49.95. [ISBN 0-387-94778-7] Nice 
text for a non-introductory course. Very clear 
coverage of linear systems. Discusses non- 
linear systems (local dynamics, global, and 
bifurcations) at a level between Strogatz and 
Guckenheimer—Holmes. Many useful exercises 
and solved examples. (First Edition, TR, March 
1992.) DK 


Differential Equations, T(16-17: 3). Multiple
Scale and Singular Perturbation Methods.
J. Kevorkian, J.D. Cole. Appl. Math.
Sci., V. 114. Springer-Verlag, 1996, viii + 
632 pp, $59. [ISBN 0-387-94202-5] Revised 
and updated version (including new material) 
of the authors’ 1981 text Perturbation Methods
in Applied Mathematics (TR, December 1981). 
Thorough survey of standard perturbation meth- 
ods. Key techniques motivated by examples; 
theory presented in detail. Topics include limit 
process expansions, method of multiple scales, 
averaging transformations. MPR 


Differential Equations, T**(14). Elementary
Differential Equations and Boundary Value
Problems, Sixth Edition. William E. Boyce, 
Richard C. DiPrima. Wiley, 1997, xvi + 
749 pp. [ISBN 0-471-08955-9] Changes in- 
clude a substantially revised chapter on numer- 
ical methods, more emphasis on investigating 
how a solution depends on a parameter, and a 
greater emphasis on visualization. (Fifth Edi- 
tion, TR, January 1993.) AO 


Partial Differential Equations, P. Markov 
Processes and Differential Equations: Asymp- 
totic Problems. Mark Freidlin. Lect. in 
Math. Birkhäuser Boston, 1996, vi + 153 pp,
$34.95 (P). [ISBN 0-8176-5392-9] 


Partial Differential Equations, P. Weak 
and Measure-valued Solutions to Evolutionary 
PDEs. J. Malek, et al. Appl. Math. & Math. 
Comput., V. 13. Chapman & Hall, 1996, xi + 
317 pp. [ISBN 0-412-57750-X] 


Partial Differential Equations, P. Minimax 
Theorems. Michel Willem. Prog. in Nonlinear 
Diff. Eqts. & Their Applic., V. 24. Birkhäuser
Boston, 1996, viii + 162 pp, $49.50. [ISBN
0-8176-3913-6]


Partial Differential Equations, P. On Spec- 
tral Theory of Elliptic Operators. Yuri Egorov, 
Vladimir Kondratiev. Oper. Theory: Adv. & 
Applic., V. 89. Birkhäuser Boston, 1996, x +
328 pp, $151.50. [ISBN 0-8176-5390-2] 


Dynamical Systems, T(18: 3), P. Hysteresis 
and Phase Transitions. Martin Brokate, Jürgen
Sprekels. Appl. Math. Sci., V. 121. Springer- 
Verlag, 1996, x + 357 pp, $59.95. [ISBN 
0-387-94763-9] Hysteresis is the mathematical
study of nonlinear “looping” phenomena
found in diverse areas such as ferro-magnetism, 
thermostat design, super-conductivity, and spin 
glasses. Detailed coverage of hysteresis from 
a variety of perspectives. Includes an introduc- 
tion to the required mathematics. MPR 


Dynamical Systems, P. Dynamical Systems: 
Differential Equations, Maps and Chaotic Behaviour.
D.K. Arrowsmith, C.M. Place. Chapman
& Hall, 1995, x + 330 pp, $57.95 (P).
[ISBN 0-412-39080-9] Paperback republica- 
tion of 1992 edition (TR, June-July 1994). 


Numerical Analysis, C, P, L. Numerical Algorithms
with Fortran. Gisela Engeln-Müllges,
Frank Uhlig. Springer-Verlag, 1996, xxii + 
602 pp, $49.95, with CD-ROM, [ISBN 3-540-
60529-0]; Numerical Algorithms with C, xxii 
+ 596 pp, $49.95, with CD-ROM. [ISBN 3- 
540-60530-4] Brief mathematical and com- 
putational descriptions of over 150 numer- 
ical algorithms together with programs im- 
plementing them. A translation, with a 
few updates, of the Seventh Edition (1993) 
of Numerik-Algorithmen mit FORTRAN 77- 
Programmen. AO 


Numerical Analysis, T*(16-17: 2), L. Numer- 
ical Analysis: Mathematics of Scientific Com- 
puting, Second Edition. David Kincaid, Ward 
Cheney. Brooks/Cole, 1996, xii + 804 pp, 
$64. [ISBN 0-534-33892-5] A well-written 
text. Includes some non-traditional topics (e.g., 
the multigrid method, delay differential equa- 
tions). Some sections have been rewritten and 
enlarged. Other changes include new problems,
an updated bibliography, and an appendix with 
pointers to mathematical software. AO 


Numerical Analysis, T(15: 2), L. Practical 
Numerical Analysis. Gwynne A. Evans. Wiley, 
1995, xiii + 455 pp, $74.95. [ISBN 0-471- 
95535-3] Numerical techniques for a broad 
range of problems: solution of linear and nonlinear
equations, ordinary and partial differential
equations, integral equations; eigenvalue prob- 
lems; approximation theory; quadrature; opti- 
mization. Gives hints and answers for many of 
the exercises. AO 

Numerical Analysis, T*(15-17: 1), C, L. 
Fundamentals of Numerical Computing. L.F. 
Shampine, R.C. Allen, Jr., S. Pruess. Wiley, 
1997, x + 268 pp, $72.95. [ISBN 0-471-16363- 
5] An introduction to scientific computing re- 
quiring only calculus and modest programming 
experience. Topics reflect problems most com- 
mon in practice: solving systems of linear and 
nonlinear equations, interpolation, numerical 
integration, numerical solution of ODEs. Each 
chapter includes a case study illustrating the 
interplay of analysis and computation in the so- 
lution of real-world problems. AO 

Numerical Analysis, S(15-17), P*, L*. Af- 
ternotes on Numerical Analysis. G.W. Stewart. 
SIAM, 1996, x + 200 pp, $29.50 (P). [ISBN 
0-89871-362-5] Notes from an undergradu- 
ate course taught by a well-known expert; re- 
flects what was actually said in class each day. 
Topics include nonlinear equations, computer 
arithmetic, linear equations, polynomial interpolation,
numerical integration, and numerical
differentiation. Provides valuable insights for 
students and instructors. AO 


Operator Theory, P. The Asymptotic Be- 
haviour of Semigroups of Linear Operators. Jan 
van Neerven. Oper. Theory: Adv. & Applic., 
V. 88. Birkhäuser Boston, 1996, xii + 237 pp,
$122.50. [ISBN 0-8176-5455-0] 


Operator Theory, S(18), P, L. Fundamental 
Solutions for Differential Operators and Applications.
Prem K. Kythe. Birkhäuser Boston,
1996, xxi + 414 pp, $64.50. [ISBN 0-8176- 
3869-5] Covers over seventy linear and non- 
linear differential operators from mathematical 
physics, theory of elasticity, fluid dynamics, 
piezoelectrics, and cosmology. Assumes ad- 
vanced calculus, ordinary and partial differen- 
tial equations, complex analysis. KS 


Functional Analysis, T(18), S, P, L?. Banach
Spaces for Analysts. P. Wojtaszczyk. Stud. 
in Adv. Math., V. 25. Cambridge Univ Pr, 
1996, xiii + 382 pp, $34.95 (P); $69.50. [ISBN 
0-521-56675-4; 0-521-35618-0] Dense intro- 
duction. Emphasizes usefulness for harmonic 
analysis, complex function theory, orthonormal 
series, approximation theory, and probability. 
Lots of exercises, with hints and solutions. Ex- 
tensive bibliography. A useful, well-written 
guide. KS 


Functional Analysis, P. Function Spaces, 
Entropy Numbers and Differential Operators. 
David E. Edmunds, Hans Triebel. Tracts in 
Math., V. 120. Cambridge Univ Pr, 1996, xi + 
252 pp, $59.95. [ISBN 0-521-56036-5] 


Analysis, P. Geometry of Harmonic Maps. 
Yuanlong Xin. Prog. in Nonlinear Diff. Eqts. & 
Their Applic., V. 23. Birkhäuser Boston, 1996,
x + 241 pp, $79.50. [ISBN 0-8176-3820-2] 


Analysis, P. Ergodic Theory of Z^d-Actions.
Eds: Mark Pollicott, Klaus Schmidt. London 
Math. Soc. Lect. Note Ser., V. 228. Cambridge 
Univ Pr, 1996, viii + 484 pp, $44.95 (P). [ISBN 
0-521-57688-1] Proceedings of the 1993-94 
Warwick Symposium. 8 surveys and 12 re- 
search papers. 


Analysis, T(16-17: 1), L. Integral Transfor- 
mations, Operational Calculus, and General- 
ized Functions. R.G. Buschman. Math. & Its 
Applic., V. 377. Kluwer Academic, 1996, xiii
+ 231 pp, $117. [ISBN 0-7923-4183-X] Elementary
introduction to integral transforms and
related topics; requires only calculus and linear 
algebra background. Topics include Laplace, 
Fourier, Mellin transforms; Mikusinski opera- 
tors; generalized functions. Proofs not stressed 
but references are given. Computational exercises
and more substantial problems. Unfortunate
misspellings in “Contents.” JS


Analysis, T(16-17: 1), S, L. Transform Meth- 
ods in Applied Mathematics: An Introduction. 
Peter Lancaster, Kestutis Salkauskas. Canadian 
Math. Soc. Ser. of Mono. & Adv. Texts. Wiley, 
1996, x + 332 pp, $59.95. [ISBN 0-471-00810-
9] Introduction to significant modern appli- 
cations of transform theory. Required back- 
ground is limited to calculus and linear algebra; 
includes two chapters on complex variables. 
Topics include the Laplace transform, Fourier 
methods, Z-transforms, discrete and continu- 
ous filters, distributions, and wavelets. JS 


Analysis, P. Regularization of Inverse Prob- 
lems. Heinz W. Engl, Martin Hanke, Andreas 
Neubauer. Math. & Its Applic., V. 375. Kluwer 
Academic, 1996, viii + 321 pp, $160. [ISBN 
0-7923-4157-0] 

Analysis, P. Algebraic K-Theory. Eds: Grze- 
gorz Banaszak, Wojciech Gajda, Piotr Krason. 
Contemp. Math., V. 199. AMS, 1996, xviii + 
210 pp, $49 (P). [ISBN 0-8218-0511-8] Proceedings
of a 1995 conference at the Adam
Mickiewicz University in Poznan, Poland. 


Algebraic Geometry, T(18: 1), P. Prolegom- 
ena to a Middlebrow Arithmetic of Curves of 
Genus 2. J.W.S. Cassels, E.V. Flynn. London 
Math. Soc. Lect. Note Ser., V. 230. Cambridge 
Univ Pr, 1996, xiv + 219 pp, $37.95 (P). [ISBN 
0-521-48370-0] Curves of genus 2 can be represented
as y² = a polynomial of degree 6. Most
of what is known about them is general theory 
that is not very helpful when faced with a spe- 
cific curve. The authors begin filling in this 
gap by constructing Mordell—Weil groups and 
finding rational points. DB 


Algebraic Geometry, P. Complex Algebraic 
Surfaces, Second Edition. Arnaud Beauville. 
London Math. Soc. Stud. Texts, V. 34. Cam- 
bridge Univ Pr, 1996, ix + 132 pp, $22.95 (P); 
$59.95. [ISBN 0-521-49842-2; 0-521-49510- 
5] Reprint of the First Edition (TR, Novem- 
ber 1984), with a new appendix on complex 
surfaces. PG 


Algebraic Geometry, P. Spinning Tops: A 
Course on Integrable Systems. Michèle Audin.
Stud. in Adv. Math., V. 51. Cambridge Univ
Pr, 1996, viii + 139 pp, $34.95. [ISBN 0-521- 
56129-9] 


Differential Geometry, T?(18: 1), P. Singular 
Semi-Riemannian Geometry. Demir N. Kupeli. 
Math. & Its Applic., V. 366. Kluwer Academic, 
1996, x + 177 pp, $99. [ISBN 0-7923-3996- 
7] Relatively self-contained study of smooth 
manifolds furnished with degenerate (singular)
metric tensors, leading from an intrinsic view to 
an extrinsic one. Topics include reasonable pre- 
liminaries, singular (and singular quaternionic) 
Kähler manifolds. RM


Geometry, P. Geometric Applications of 
Fourier Series and Spherical Harmonics. H. 
Groemer. Ency. of Math. & Its Applic., V. 61. 
Cambridge Univ Pr, 1996, xi + 329 pp, $59.95. 
[ISBN 0-521-47318-7] Proves results from 
geometry (mostly from convex set theory) us- 
ing Fourier series and spherical harmonics. In- 
cludes background material on convex sets, 
analysis; develops necessary results on Fourier 
series, spherical harmonics. LC 


Algebraic Topology, P. Ends of Complexes. 
Bruce Hughes, Andrew Ranicki. Tracts in 
Math., V. 123. Cambridge Univ Pr, 1996, xxv 
+ 353 pp, $64.95. [ISBN 0-521-57625-3] 


Topology, T(15), L. Knot Theory and Its Ap- 
plications. Kunio Murasugi. Transl: Bohdan 
Kurpita. Birkhäuser Boston, 1996, 341 pp,
$69.50. [ISBN 0-8176-3817-2] Introductory 
treatment. Little background assumed, so no 
mention of algebraic topology. Plenty of pic- 
tures and exercises. Several sections show ap- 
plications to science. JD 


Topology, T(17-18: 1). Topology Via Logic. 
Steven Vickers. Tracts in Theoret. Comp. Sci., 
V. 5. Cambridge Univ Pr, 1996, 200 pp, 
$24.95 (P). [ISBN 0-521-57651-2] Paper- 
back republication of 1989 edition (TR, March 
1990). Written for computer scientists. 


Operations Research, T(17: 2), P, L*. Monte 
Carlo: Concepts, Algorithms, and Applica- 
tions. George S. Fishman. Ser. in Oper. Res. 
Springer-Verlag, 1996, xxv + 698 pp, $69. 
[ISBN 0-387-94527-X] A comprehensive in- 
troduction with chapters on estimating volume 
and count, generating samples, increasing ef- 
ficiency, random tours, designing and analyz- 
ing sample paths, and generating pseudorandom 
numbers. AO 


Operations Research, T?(16: 1), C. Introduction
to Practical Linear Programming.
David J. Pannell. Wiley, 1997, xiii + 333 pp, 
$59.95, with disk. [ISBN 0-471-51789-5] 
Discusses model construction, interpretation 
of software output, sensitivity analysis, etc. 
Nonmathematical—the simplex algorithm is 
not presented. AO 

Optimization, T(17: 1), C. A First Course 
in Optimization Theory. Rangarajan K. Sun- 
daram. Cambridge Univ Pr, 1996, xvii + 357 pp, 
$74.95; $27.95 (P). [ISBN 0-521-49719-1; 
0-521-49770-1] Blends theory and economic 
applications. Major topics: existence of solutions
in R^n (e.g., theorems of Weierstrass and
Lagrange, Karush-Kuhn-Tucker conditions);
parametric variation of solutions; finite- and 
infinite-horizon dynamic programming. AO 


Optimization, T(16-17: 1), C. Linear and In- 
teger Programming: Theory and Practice. Ger- 
ard Sierksma. Pure & Appl. Math., V. 198. 
Marcel Dekker, 1996, xiv + 673 pp, $175, with 
disk. [ISBN 0-8247-9695-0] Covers the sim- 
plex method, duality and sensitivity analysis, 
basic solution techniques for integer program- 
ming problems, and the interior path version 
of Karmarkar’s interior point method. Includes 
a chapter on model building with case studies; 
proofs of major results. Exercise solutions are 
in an appendix. AO 


Optimal Control, T(18: 1), P. Operator Ap- 
proach to Linear Control Systems. A. Chere- 
mensky, V. Fomin. Math. & Its Applic., 
V. 345. Kluwer Academic, 1996, xvi + 396 pp, 
$198. [ISBN 0-7923-3765-4] Develops oper- 
ator theoretic approach to LQP (optimization 
problems for linear control with quadratic per- 
formance index). Extends classical theory to 
cases with distributed parameter plant described 
by functional (time delay) or partial differential 
equations. The linear control plant is treated as 
an unbounded linear operator in Hilbert space 
with time structure. RM 


Probability, T(18: 1), P. Lévy Processes. Jean 
Bertoin. Tracts in Math., V. 121. Cambridge 
Univ Pr, 1996, x + 265 pp, $54.95. [ISBN 0- 
521-56243-0] Concise overview of stochas- 
tic processes with independent and stationary 
time increments (random walks in continuous 
time) within Euclidean framework, using ana- 
lytic tools (Fourier, Laplace transforms). These 
include Poisson processes, Brownian motion, 
stable processes, and serve as prototypes of 
Markov processes. RM 


Stochastic Processes, P. Continuum Percola- 
tion. Ronald Meester, Rahul Roy. Tracts in 
Math., V. 119. Cambridge Univ Pr, 1996, x + 
238 pp, $49.95. [ISBN 0-521-47504-X] 


Mathematical Statistics, T(17: 1). A Course 
in Large Sample Theory. Thomas S. Ferguson. 
Texts in Stat. Sci. Chapman & Hall, 1996, ix 
+ 245 pp, $34.95 (P). [ISBN 0-412-04371-8] 
Covers basic probability, laws of large num- 
bers, central limit theorems, Slutsky theorems, 
asymptotic theory and distributions of quan- 
tiles, maximum likelihood estimates, likelihood 
ratio statistics, and posterior distributions. Con- 
tains many examples and exercises with detailed 
solutions. RS 
Statistical Methods, T(18), P. Smoothing
Methods in Statistics. Jeffrey S. Simonoff. Ser. 
in Stat. Springer-Verlag, 1996, xii + 338 pp,
$54.95. [ISBN 0-387-94716-7] Discusses a 
variety of methods for nonparametric density, 
distribution, and regression function estimation. 
Includes older as well as recently developed 
techniques. Highlights useful and promising 
methods for scientists who analyze data from a 
practical standpoint. Most topics motivated by 
data examples. Computational issues section 
lists sources of code for methods. RS 


Statistical Methods, P. Advances in Biometry: 
50 Years of the International Biometric Society. 
Eds: Peter Armitage, Herbert A. David. Ser. 
in Prob. & Stat. Wiley, 1996, xii + 473 pp, 
$59.95. [ISBN 0-471-16018-0] 21 articles 
covering a wide range of topics including math- 
ematical and statistical techniques, methodol- 
ogy, and applications in various fields of biol- 
ogy and medicine. RS 


Statistical Methods, T(17-18: 1, 2), P. Meth- 
ods and Applications of Linear Models: Regres- 
sion and the Analysis of Variance. Ronald R. 
Hocking. Ser. in Prob. & Stat. Wiley, 1996, xxii 
+ 731 pp, $69.95. [ISBN 0-471-59282-X] Introduces
linear model theory, then discusses linear
regression models and analysis of variance
models. Graphical examples used throughout to 
illustrate and simplify theory. Presents classical 
methods as well as new and developing tech- 
niques (e.g., Hocking’s AVE method for mixed 
effects models). RS 


Statistical Methods, P. Adaptive Sampling. 
Steven K. Thompson, George F. Seber. Ser. 
in Prob. & Stat. Wiley, 1996, xi + 265 pp, 
$54.95. [ISBN 0-471-55871-0] Assumes 
some knowledge of sample survey theory. Dis- 
cusses a wide range of methods including re- 
cent developments. Worked examples are given 
throughout to illustrate theory and methods. RS 


Mathematical Computing, T*(15-16), S. 
Maple: A Comprehensive Introduction. Roy 
Nicolaides, Noel Walkington. Cambridge Univ 
Pr, 1996, xix + 466 pp, $39.95. [ISBN 0-521- 
56230-9] An introduction to Maple at a some- 
what deeper level than the average “Intro to 
Maple” text. Emphasizes why and how Maple 
works, not just what it does. Aims to provide 
the reader with a solid foundation in Maple as 
a computer algebra system and a programming 
language. Appears to succeed admirably. Cov- 
ers Maple V Release 4. MPR 

Mathematical Computing, P, C. Computa- 
tion of Special Functions. Shanjie Zhang, Jian- 
ming Jin. Wiley, 1996, xxvi + 717 pp, $89.95, 
with disk. [ISBN 0-471-11963-6] Software 
for evaluating special functions (or some property
of them, such as their derivatives, zeros, coefficients
of certain expansions, etc.) that arise
in engineering or the sciences. LC 
Mathematical Computing, S(15—17), P*, L*. 
Introduction to Maple, Second Edition. André 
Heck. Springer-Verlag, 1996, xx + 699 pp, 
$39.95. [ISBN 0-387-94535-0] Revised and 
updated to reflect new mathematical features 
and improvements in Maple V Release 4. (First 
Edition, TR, March 1994.) AO 


Computer Science, T(16-17: 1, 2), L. Foun- 
dations for Programming Languages. John 
C. Mitchell. Found. of Comp. Ser. MIT Pr, 
1996, xix + 846 pp, $60. [ISBN 0-262-13321- 
0] Modern, rich, and encyclopedic treatment 
of the logico-mathematical foundations of pro- 
gramming languages. Stresses analysis of syn- 
tactic, operational, and semantic properties via 
a series of typed lambda calculi. Useful as an 
introduction to the foundations of languages, 
semantics, or type theory; deep but accessible 
to good undergraduates. RM 


Applications (Economics), P. Mathematical 
Models in Finance. Eds: S.D. Howison, F.P.
Kelly, P. Wilmott. Chapman & Hall, 1995, viii 
+ 152 pp, $60.50. [ISBN 0-412-63070-2] 13 
papers from a Royal Society of London dis- 
cussion meeting. Subjects range from classical 
theory to recent research results. 


Applications (Economics), P, C. Computa- 
tional Economics and Finance: Modeling and 
Analysis with Mathematica. Ed: Hal R. Varian. 
Springer-Verlag, 1996, xiv + 468 pp, $54.95, 
with disk. [ISBN 0-387-94518-0] 16 articles 
illustrate use of Mathematica in economics, fi- 
nance, and statistics. Diskette contains related 
Mathematica notebooks and packages. 


Applications (Economics), T(14: 1), S. Mathematics
for Economics and Finance: Methods
and Modelling. Martin Anthony, Norman
Biggs. Cambridge Univ Pr, 1996, 394 pp, 
$24.95 (P); $75. [ISBN 0-521-55913-8; 0-521- 
55113-7] An introduction to calculus (single- 
and multivariable) and linear algebra in the con- 
text of economics. Each chapter organized as 
narrative exposition, worked examples, sum- 
mary (list of main topics, key terms, notations, 
and formulae), and exercises. AO 
Applications (Economics), P. An Introduction 
to Bayesian Inference in Econometrics. Arnold 
Zellner. Classics Lib. Ed. Wiley, 1996, xv 
+ 431 pp, $39.95 (P). [ISBN 0-471-16937-4] 
Paperback republication of 1971 edition (TR, 
April 1973). 

Applications (Economics), T(17: 1), P. Introduction
to Stochastic Calculus Applied
to Finance. Damien Lamberton, Bernard 
Lapeyre. Transl: Nicolas Rabeau, François
Mantion. Chapman & Hall, 1996, xi + 185 pp, 
$45.95. [ISBN 0-412-71800-6] Applications 
of probabilistic techniques to financial model- 
ing. Presents discrete-time models, the Black— 
Scholes model, interest rate models, and sim- 
ulation techniques. Assumes background in 
measure-theoretic probability. AO 


Applications (Mechanics), P. Lecture Notes in 
Control and Information Sciences—220: Nons- 
mooth Impact Mechanics: Models, Dynamics 
and Control. Bernard Brogliato. Springer- 
Verlag, 1996, xv + 400 pp, $76 (P). [ISBN 
3-540-76079-2] 


Applications (Quantum Theory), P. The In- 
famous Boundary: Seven Decades of Heresy 
in Quantum Physics. David Wick. Springer- 
Verlag, 1995, xvii + 310 pp, $19 (P). [ISBN 
0-387-94726-4] Paperback republication of 
1995 Birkhäuser Boston edition (TR, March
1996). 


Applications (Systems Theory), P. Lecture 
Notes in Control and Information Sciences— 
217: Robust Control via Variable Structure and 
Lyapunov Techniques. Eds: Franco Garofalo, 
Luigi Glielmo. Springer-Verlag, 1996, xxi + 
307 pp, $63 (P). [ISBN 3-540-76067-9] Papers
from a 1994 IEEE Workshop in Benevento,
Italy. 


Applications (Systems Theory), P. Lecture 
Notes in Control and Information Sciences— 
221: Control of Nonlinear Multibody Flexible 
Space Structures. Atul Kelkar, Suresh Joshi. 
Springer-Verlag, 1996, xiv + 141 pp, $43 (P). 
[ISBN 3-540-76093-8] 


Applications, P. Computational Psycholin- 
guistics: AI and Connectionist Models of Hu- 
man Language Processing. Eds: Ton Dijkstra, 
Koenraad de Smedt. Taylor & Francis, 1996, xv 
+ 437 pp, $29.95 (P). [ISBN 0-7484-0466-X] 


Applications, T(13: 1). Early Astronomy. 
Hugh Thurston. Springer-Verlag, 1994, x + 
268 pp, $29.95 (P). [ISBN 0-387-94107-X] A 
very enjoyable account of early developments 
in astronomy. PF 


Reviewers 


DB: David Bressoud, Macalester; LC: Laura Chihara, 
St. Olaf; JD: Jill Dietz, St. Olaf; PF: Paul Froeschl, 
Macalester; PG: Philip Gloor, St. Olaf; TH: Tom 
Halverson, Macalester; DK: Danny Kaplan, Macalester;
RM: Richard Molnar, Macalester; AO: Arnold Ostebee, 
St. Olaf; MPR: Matthew P. Richey, St. Olaf; KS: Karen 
Saxe, Macalester; JS: John Schue, Macalester; RS: Richard 
Single, St. Olaf; LAS: Lynn Arthur Steen, St. Olaf. 




THE AUTHORS 


A. J. BERRICK was educated at the University of Sydney and at Oxford University where 
he became a fellow of St. John’s College. After lecturing at Imperial College, London, he 
joined the National University of Singapore’s Mathematics Department, where he is now a 
professor. His research publications have mainly been in algebraic topology, cohomology of 
groups, and algebraic K-theory, the subject of his Pitman book. 


M. E. KEATING obtained his B.Sc. at University College, London, and his Ph.D. at King’s 
College, London. He then joined the Mathematics Department at Imperial College, 
London, where he has been a permanent fixture apart from sabbaticals at McGill Univer- 
sity, Montreal, and the National University of Singapore. His research interests center on 
the relationships between number theory, noncommutative ring theory, and K-theory. 
While Berrick and Keating were together in Singapore, they started a long-term collabora- 
tion to produce graduate level texts on K-theory and its prerequisites, which should soon 
bear fruit in the Cambridge University Press. 


MAARTEN BOERLIJST, born 1963, received his Ph.D. in 1994 with Pauline Hogeweg in 
Utrecht (the Netherlands). His research interests include prebiotic evolution, spatial models 
in ecology and evolution, theoretical immunology, virus dynamics, and evolutionary game 
theory. He is now a “European Commission Human Capital and Mobility Fellow” or 
“Flying Dutchman” with the Mathematical Biology group in Oxford. He enjoys outdoor 
sports, including sailing, horseback riding, and travelling. 


MARTIN NOWAK was born in Vienna in 1965, met Karl Sigmund in 1987, and became 
immediately engaged in an (almost) endlessly repeated Prisoner’s Dilemma. Nowak eventu- 
ally escaped to Oxford, where he is Head of the Mathematical Biology group in the Zoology 
Department. His research interests range from mathematical models of evolutionary dynam- 
ics to virus infections and the immune system. 


KARL SIGMUND, born 1945, is professor of mathematics at the University of Vienna. His 
interests include evolutionary game theory, dynamical systems, and (more recently) the 
history of the Vienna Circle. His main attempt at popular science is the book Games of Life. 


JIM SAUERBERG grew up in northern Wisconsin (Twins), and obtained a B.S. from the 
University of Wisconsin at Madison (Brewers) and a Ph.D. from Brown University (Red 
Sox). He then went to Union College in Schenectady, New York (None nearby—argh!), and 
has recently moved to St. Mary’s College in the Bay Area (Giants and A’s). He manages to 
do some number theory between innings. 


LINGHSUEH SHU received degrees in mathematics from National Central University in 
Taiwan and from Brown University, and has since taught at the Ohio State University and 
the University of Vermont. Although her expertise is in number theory, and she enjoys all 
types of mathematics, she is beginning to suspect that her true love is pottery. 


DAN VELLEMAN received his bachelor’s degree from Dartmouth College in 1976 and his 
doctorate from the University of Wisconsin in 1980. He taught at the University of Texas 
and the University of Toronto before joining the faculty of Amherst College in 1983. He is 
interested in logic, the philosophy of mathematics, and the foundations of quantum 
mechanics. When he isn’t doing mathematics he enjoys bicycling and singing. 




ALFRED M. BRUCKSTEIN was born in Transylvania, then moved to Israel, and is now at 
Bell Laboratories for an extended sabbatical from the Technion in Haifa, where he is 
Professor of Computer Science. He became interested in modeling ants and their trails 
while reading “Surely You are Joking, Mr. Feynman.” He enjoys designing logos and, most 
of all, playing with his son, Ariel. 


COLIN L. MALLOWS has recently left Bell Labs (which is now part of Lucent Technolo- 
gies) and joined the new AT & T Laboratory, where he plans to continue to try to make 
statistics seem less like sorcery. 


ISRAEL A. WAGNER is a research staff member at IBM Haifa Research Laboratory, where 
he creates and chases bugs in VLSI circuits for high-end processors. He is also working on a 
Ph.D. thesis, trying to show that many small-scale agents can cooperate to achieve large-scale 
targets. 


RICHARD STANLEY received his Ph.D. from Harvard University under the direction of 
Gian-Carlo Rota (at M.I.T.). He was a Moore Instructor at M.I.T. in 1970-1971 and a 
Miller Research Fellow at Berkeley in 1971-1973. In 1973 he returned to M.I.T. where he is 
now Professor of Applied Mathematics. Stanley’s main research interest is combinatorics, 
especially its connections with algebra, geometry, and topology. He has long been interested 
in classical history and is pleased to have the opportunity to combine his interest in this area 
with that of mathematics. 


GERT ALMKVIST received his Ph.D. from UC Berkeley in 1966. He has been at University 
of Lund since 1967 with visits to Berkeley and Urbana. His main interests are in algebraic 
K-theory, invariant theory, and analytic number theory. He is the founder of the Institute of 
Algebraic Meditation. He enjoys table tennis (once commutative algebraic world champion), 
tennis, and hiking. 


MARTIN L. JONES is Associate Professor of Mathematics at the University of Charleston, 
South Carolina where he has been since receiving his Ph.D. in mathematics from the 
Georgia Institute of Technology in 1989. He has recently returned from spending a year as a 
Fulbright Scholar at the University of the Andes in Venezuela. His research interests are in 
sequential decision theory, most notably optimal stopping problems and bandit processes. 


HASSAN SEDAGHAT received his Ph.D. from the George Washington University, where 
he worked on semigroup compactifications. Nowadays, he is studying the global behavior of 
trajectories of nonlinear scalar and vector difference equations. He is also interested in the 
applications of such equations in economics, particularly, in the dynamical modelling of 
consumer demand without utility functions. His nonmathematical interests include reading 
classical fiction and watching adventure and science fiction movies. 


JET WIMP, a mathematician and a poet, teaches at Drexel University in Philadelphia. 


FRANK MORGAN works in minimal surfaces. He has a weekly live call-in Math Chat 
on local cable TV, featured in Ivars Peterson’s column MathLand at the MAA web 
site at http://www.maa.org/ (search Frank and Morgan:BY), and a biweekly Math 
Chat column in the Christian Science Monitor, sometimes available via the web page 
http://www.csmonitor.com/. His books include Geometric Measure Theory: a Beginner’s 
Guide, Riemannian Geometry: a Beginner’s Guide, and Calculus Lite. 




EDITOR’S ENDNOTES 


Michael Hardy sent the following comment after reading David Poole’s article The 
Stochastic Group [102 (1995) 798-801]: 


David Poole showed that the set of $n \times n$ nonsingular matrices in which the sum of the entries 
in each column is 1 is a group, and is isomorphic to the affine group that is often represented by 
the group of matrices of the form 

$$\begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix}.$$

Poole’s group of “stochastic matrices” deserves to be called the “natural representation of the 
affine group.” An “affine combination” of $n$ points $p_1, \ldots, p_n$ in an affine space is a point 
$c_1 p_1 + \cdots + c_n p_n$, where the scalars $c_1, \ldots, c_n$ satisfy $c_1 + \cdots + c_n = 1$. 
Consider $n$ affinely independent points $p_1, \ldots, p_n$ in an $(n - 1)$-dimensional affine space. 
Any affine automorphism $\gamma$ is determined by $\gamma(p_1), \ldots, \gamma(p_n)$, and every point in 
the affine space can be written uniquely as an affine combination of $p_1, \ldots, p_n$. Thus 
$\gamma(p_j) = c_{1j} p_1 + \cdots + c_{nj} p_n$ for $j = 1, \ldots, n$. The matrix $C$ whose $i, j$ entry 
is $c_{ij}$ represents this affine transformation. Identify any point $d$ in the affine space with the 
column vector of its affine coefficients, writing $d = d_1 p_1 + \cdots + d_n p_n = (d_1, \ldots, d_n)^T$. 
The (column vector of affine coefficients of the) image of $d$ under $\gamma$ is $\gamma(d) = Cd$. It 
follows that matrix multiplication corresponds to composition of affine transformations, so this 
really is a group representation. All of this works just as well regardless of the field of scalars. 
There is also a geometric, rather than algebraic, way to look at the argument showing that the two 
groups of matrices are conjugate to each other as subgroups of $GL(n)$: Think about which 
$(n - 1)$-dimensional affine subspaces are invariant. 


In response, Poole said that Hardy’s observations “nicely complement the main 
result of my article...I completely agree...that the representation...is the 
‘natural’ one for the affine group.” 

Two printer’s errors mar the displayed inequality in part (ii) of the Theorem at 
the top of page 804 in J. W. Sander’s article A Story of Binomial Coefficients and 
Primes [102 (1995) 802-807]. The correct inequality is 


E, ,(p*) < [7 - +) 'E (K), 


Paul Smith, writing about Arthur T. White’s article on Fabian Stedman [103 (1996) 
771-778], says 


White makes a convincing case for an implicit mastery of group theory by the composer of 
Stedman Doubles—a change (composition) for bell ringers. This is not the first time that insight 
was offered in the MONTHLY, however. T. J. Fletcher, in Campanological Groups [63 (1956) 
619-626] observed: “One of the most pleasing methods from a mathematical point of view is 
Stedman’s Doubles. This method was invented round about 1640, and it displays a striking 
knowledge of decomposition into cosets very nearly one hundred years before Lagrange was 
born.” 


Alan Horwitz wrote 


The article A Hundred Years of Prime Numbers [103 (1996) 729-741] looks very interesting, but 
shouldn’t the MONTHLY have waited one year and celebrated 101 years of prime numbers? 


The editor regrets the misspelling of the name of the lyricist for the delightful “It’s 
De-Lovely” in last November’s issue [103 (1996) 770]. He is Ronald E. Prather, 
Trinity University. 

Roger A. Horn, Editor 





American Mathematical Society 


Recently Published by the AMS 


Introduction to Probability 


Second Revised Edition 


Charles M. Grinstead, Swarthmore College, PA, 
and J. Laurie Snell, Dartmouth College, 
Hanover, NH 


This text is designed for an introductory proba- 
bility course at the university level for sopho- 
mores, juniors, and seniors in mathematics, the 
physical and social sciences, engineering, and 
computer science. It presents a thorough treat- 
ment of probability ideas and techniques nec- 
essary for a firm understanding of the subject. 


The text is also recommended for use in dis- 
crete probability courses. The material is orga- 
nized so that the discrete and continuous 
probability discussions are presented in a sep- 
arate, but parallel, manner. This organization 
doesn’t emphasize an overly rigorous or formal 
view of probability and therefore offers 
some strong pedagogical value. Hence, the 
discrete discussions can sometimes serve to 
motivate the more abstract continuous proba- 
bility discussions. 


Features: 


• Key ideas are developed in a somewhat 
leisurely style, providing a variety of inter- 
esting applications to probability and show- 
ing some nonintuitive ideas. 


• Over 600 exercises provide the opportunity 
for practicing skills and developing a sound 
understanding of ideas. 


• Text includes many computer programs that 
illustrate the algorithms or the methods of 
computation for important problems. 

1997; 484 pages; Hardcover; ISBN 0-8218-0749-8; 


List $49; All AMS members $39; Order code 
IPROBMM74 


Robert Steinberg 
Collected Papers 


Robert Steinberg, University of California, 
Los Angeles 


This volume contains all of Steinberg’s pub- 
lished papers on group theory, including those 
on “special representations” (now called 
Steinberg representations), tensor products of 
representations, finite reflection groups, regu- 
lar elements of algebraic groups, Galois coho- 
mology, universal extensions, etc. At the end 
of the book, there is a section called 
“Comments on the Papers”. The comments by 
Steinberg contain minor corrections and clarifi- 
cations and explain how ideas and results 
have evolved and been used since they first 
appeared. 

Collected Works, Volume 7; 1997; 598 pages; 


Hardcover; ISBN 0-8218-0576-2; List $79; Individual 
member $47; Order code CWORKS/7MM/74 


Techniques of Problem Solving 


Steven G. Krantz, Washington University, 
St. Louis, MO 


... the subject of problem solving ... is more than 
just a disconnected list of brain teasers and recre- 
ations. It is a way of life. Scientists of every stripe— 
chemists, physicists, psychologists, social engineers, 
and many others—ply their trade by considering a 
set of data, deciding what techniques are relevant to 
these data, and then solving a problem. It is this 
view of problem solving that will be promulgated in 
the present book. 

—from the Preface 
The purpose of this book is to teach the basic 
principles of problem solving, including both 
mathematical and nonmathematical problems. 
This book will help students to ... 


• translate verbal discussions into analytical 
data. 


• learn problem-solving methods for attacking 
collections of analytical questions or data. 


• build a personal arsenal of internalized 
problem-solving techniques and solutions. 


• become “armed problem solvers”, ready to 
do battle with a variety of puzzles in differ- 
ent areas of life. 


Taking a direct and practical approach to the 
subject matter, Krantz's book stands apart 
from others like it in that it incorporates exer- 
cises throughout the text. After many solved 
problems are given, a “Challenge Problem” is 
presented. Additional problems are included 
for readers to tackle at the end of each chapter. 
There are more than 350 problems in all. A 
Solutions Manual to most end-of-chapter exer- 
cises is available. 


1997; 465 pages; Softcover; ISBN 0-8218-0619-X; 
List $29; All AMS members $23; Order code TPSMM74 


Vertex Algebras for Beginners 


Victor Kac, Massachusetts Institute of 
Technology, Cambridge 


This book is an introduction to algebraic 
aspects of conformal field theory, which in the 
past decade revealed a variety of unusual 
mathematical notions. Vertex algebra theory 
provides an effective tool to study them in a 
unified way. 


Here, a mathematician will encounter new 
algebraic structures that originated from 
Einstein's special relativity postulate and 
Heisenberg's uncertainty principle. A physicist 
will find familiar notions presented in a more 
rigorous and systematic way, which may lead 
to a better understanding of foundations of 
quantum physics. 

University Lecture Series, Volume 10; 1997; 141 pages; 


Softcover; ISBN 0-8218-0643-2; List $25; 
All AMS members $20; Order code ULECT/10MM74 


All prices subject to change. Charges for delivery are $3.00 per order. For air delivery outside of the continental U. S., please include $6.50 per 
item. Prepayment required. Order from: American Mathematical Society, P. O. Box 5904, Boston, MA 02206-5904. For credit card orders, fax (401) 
331-3842 or call toll free 800-321-4AMS (4267) in the U. S. and Canada, (401) 455-4000 worldwide. Or place your order through the AMS book- 
store at http://www.ams.org/bookstore/. Residents of Canada, please include 7% GST. 


Lion Hunting 


and Other Mathematical Pursuits 


A Collection of Mathematics, Verse, and Stories 


by Ralph P. Boas, Jr. 


Gerald L. Alexanderson and 
Dale H. Mugler, Editors 


I highly recommend Lion Hunting and Other 
Mathematical Pursuits to high school mathematics 
clubs, mathematics teachers of all levels, and anyone 
interested in mathematics. Perhaps the most important 
feature of this book is how it subtly makes the 
reader aware of the nature of mathematics. 


— The Mathematics Teacher 


As a young man at the Institute for Advanced Study in 
Princeton, Ralph Philip Boas, Jr., together with a group of 
other mathematicians, published a light-hearted article on 
the “mathematics of lion hunting” under a pseudonym 
(1938). This sparked a sequence of articles on the topic, 
several of which are drawn together in this book. 


Lion Hunting includes an assortment of articles that show 
the many facets of this remarkable mathematician, editor, 
writer, and teacher. Along with a variety of his lighter 
mathematical papers, the collection includes Boas’ verse 
and short stories, many of which are appearing for the first 
time. Anecdotes and recollections of his numerous experi- 
ences and of his work and meetings with many distin- 
guished mathematicians and scientists of his day are also 
included as well as photographs taken by Boas of Hardy, 
Littlewood, Besicovitch, Weil, and others. 


The mathematical articles in this collection cover a range 
of topics. They include articles on infinite series, the mean 
value theorem, indeterminate forms, complex variables, 
inverse functions, extremal problems for polynomials and 
more. A special section of this book is devoted to articles 
about the teaching of mathematics, with titles such as 


“Calculus as an experimental science” and “Can we make 
mathematics intelligible?” 


Boas’s wit and playful humor are reflected in the verses 
included in this collection. The verses reflect the phases of 
his career as author, editor, teacher, department chair, and 
lover of literature. A section of the book describes the feud 
that Boas supposedly had with Bourbaki. Also included are 
many amusing anecdotes about famous mathematicians. 


320 pp., Paperbound, 1995, ISBN 0-88385-323-X 
List: $39.95 MAA Member: $28.95 
Catalog Code: DOL-15 


ORDER FROM: 
THE MATHEMATICAL ASSOCIATION OF AMERICA 
P.O. Box 91112, Washington, DC 20090-1112 


1-800-331-1622 




(301) 617-7800 FAX (301) 206-9789 


Calculus Lite 


by Frank Morgan 


A K Peters, Ltd. 
89 Linden St. 
Wellesley, MA 02181 
239-2404 
rs@tiac.net 

ISBN 1-56881-037-7 
1997, 299 pages, $34.00 

A concise and straightforward text which allows teachers to apply the material to 
the needs of their classes. 
Was Newton’s Calculus a Dead End? 
The Continental Influence of Maclaurin’s 
Treatise of Fluxions 


Judith V. Grabiner 


1. INTRODUCTION. Eighteenth-century Scotland was an internationally-recog- 
nized center of knowledge, “a modern Athens in the eyes of an enlightened world.” 
[74, p. 40] [81] The importance of science, of the city of Edinburgh, and of the 
universities in the Scottish Enlightenment has often been recounted. Yet a key 
figure, Colin Maclaurin (1698-1746), has not been highly rated. It has become a 
commonplace not only that Maclaurin did little to advance the calculus, but that 
he did much to retard mathematics in Britain—although he had (fortunately) no 
influence on the Continent. Standard histories have viewed Maclaurin’s major 
mathematical work, the two-volume Treatise of Fluxions of 1742, as an unread 
monument to ancient geometry and as a roadblock to progress in analysis. 
Nowadays, few people read the Treatise of Fluxions. Much of the literature on the 
history of the calculus in the eighteenth and nineteenth centuries implies that few 
people read it in 1742 either, and that it marked the end—the dead end—of the 
Newtonian tradition in calculus. [9, p. 235], [49, p. 429], [10, p. 187], [11, pp. 228-9], 
[43, pp. 246-7], [42, p. 78], [64, p. 144] 

But can this all be true? Could nobody on the Continent have cared to read the 
major work of the leading mathematician in eighteenth-century Scotland? Or, 
if the work was read, could it truly have been “of little use for the researcher” 
[42, p. 78] and have had “no influence on the development of mathematics”? 
[64, p. 144] 

We will show that Maclaurin’s Treatise of Fluxions did develop important ideas 
and techniques and that it did influence the mainstream of mathematics. The 
Newtonian tradition in calculus did not come to an end in Maclaurin’s Britain. 
Instead, Maclaurin’s Treatise served to transmit Newtonian ideas in calculus, 
improved and expanded, to the Continent. We will look at what these ideas were, 
what Maclaurin did with them, and what happened to this work afterwards. Then, 
we will ask what by then should be an interesting question: why has Maclaurin’s 
role been so consistently underrated? These questions will involve general matters 
of history and historical writing as well as the development of mathematics, and 
will illustrate the inseparability of the external and internal approaches in under- 
standing the history of science. 


2. THE STANDARD PICTURE. Let us begin by reviewing the standard story 
about Maclaurin and his Treatise of Fluxions. The calculus was invented indepen- 
dently by Newton and Leibniz in the late seventeenth century. Newton and Leibniz 
developed general concepts—differential and integral for Leibniz, fluxion and 
fluent for Newton—and devised notation that made it easy to use these concepts. 
Also, they found and proved what we now call the Fundamental Theorem of 
Calculus, which related the two main concepts. Last but not least, they successfully 
applied their ideas and techniques to a wide range of important problems. 
[9, p. 299] It was not until the nineteenth century, however, that the basic concepts 
were given a rigorous foundation. 

In 1734 George Berkeley, later Bishop of Cloyne, attacked the logical validity of 
the calculus as part of his general assault on Newtonianism. [12, p. 213] Berkeley’s 
criticisms of the rigor of the calculus were witty, unkind, and—with respect to the 
mathematical practices he was criticizing —essentially correct. [6, v. 4, pp. 65-102] 
[38, pp. 33-34] [82, pp. 332-338] Maclaurin’s Treatise was supposedly intended to 
refute Berkeley by showing that Newton’s calculus was rigorous because it could be 
reduced to the methods of Greek geometry. [10, pp. 181-2, 187] [9, pp. 233, 235] 
Maclaurin himself said in his preface that he began the book to answer Berkeley’s 
attack, [63, p. i] and also to rebut Berkeley’s accusation that mathematicians were 
hostile to religion. [78, p. 50] 

The majority of Maclaurin’s treatise is contained in its first Book, which is called 
“The Elements of the Method of Fluxions, Demonstrated after the Manner of the 
Ancient Geometricians.” That title certainly sounds as though it looks backward to 
the Greeks, not forward to modern analysis. And the text is full of words—lots of 
words. So much time is spent on preliminaries that it is not until page 162 that 
he can show that the fluxion of ay is a times the fluxion of y. Florian Cajori, 
whose writings have helped spread the standard story, compared Maclaurin to the 
German poet Klopstock who, Cajori said, was praised by all, read by none. 
[10, p. 188] While British mathematicians, bogged down with geometric baggage, 
studied and revered the work and notation of Newton and argued with Berkeley 
over foundations, Continental mathematicians went onward and upward analyti- 
cally with the calculus of Leibniz. The powerful analytic results and techniques in 
eighteenth-century Continental mathematics were all that mathematicians like 
Cauchy, Riemann, and Weierstrass needed for their nineteenth-century analysis 
with its even greater power, together with its improved rigor and generality. 
[9, ch. 7] [49, p. 948] This story became so well known that it was cited by the 
literary critic Matthew Arnold, who wrote, “The man of genius [Newton] was 
continued by...completely powerless and obscure followers.... The man of 
intelligence [Leibniz] was continued by successors like Bernoulli, Euler, Lagrange, 
and Laplace—the greatest names in modern mathematics.” [1, p. 54; cited by 
[61, p. 15]] 

Now since I myself have contributed to the standard story, especially in 
delineating the links among Euler, Lagrange, and Cauchy, [38, chs. 3-6] I have a 
good deal of sympathy for it, but I now think that it must be modified. Maclaurin’s 
Treatise of Fluxions is an important link between the calculus of Newton and 
Continental analysis, and Maclaurin contributed to key developments in the 
mathematics of his contemporaries. Let us examine the evidence for this state- 
ment. 


3. THE NATURE OF MACLAURIN’S TREATISE OF FLUXIONS. Why—the 
standard story notwithstanding—might Maclaurin’s Treatise of Fluxions have been 
able to transmit Newtonian calculus, improved and expanded, to the Continent? 
First, because the Treatise of Fluxions is not just one “Book,” but two. While Book 
I is largely, though not entirely, geometric, Book II has a different agenda. Its title 
is “On the Computations in the Method of Fluxions.” [my italics] Maclaurin began 
Book II by championing the power of symbolic notation in mathematics. [63, 
pp. 575-576] He explained, as Leibniz before him and Lagrange after him would 
agree, that the usefulness of symbolic notation arises from its generality. So, 
Maclaurin continued, it is important to demonstrate the rules of fluxions once 
again, this time from a more algebraic point of view. Maclaurin’s appreciation of 
the algorithmic power of algebraic and calculus notation expresses a common 
eighteenth-century theme, one developed further by Euler and Lagrange in their 
pursuit of pure analysis detached from any kind of geometric intuition. To be sure, 
Maclaurin, unlike Euler and Lagrange, did not wish to detach the calculus from 
geometry. Nonetheless, Maclaurin’s second Book in fact, as well as in rhetoric, has 
an algorithmic character, and most of its results may be read independently of 
their geometric underpinnings, even if Maclaurin did not so intend. (In his Preface 
to Book I, he even urged readers to look at Book II before the harder parts of 
Book I.) [63, p. iii] The Treatise of Fluxions, then, was not foreign to the Continental 
point of view, and may have been written in part with a Continental audience in 
mind. 

Nor was this algebraic character a secret open only to the reader of English. 
There was a French translation in 1749 by the Jesuit R. P. Pézénas, including an 
extensive table of contents. [62] Lagrange, among others, seems to have used this 
French edition (since he cited it by the French title [58, p. 17] though he cited 
other English works in English [58, p. 18]). Pézénas’ translation, moreover, was 
neither isolated nor idiosyncratic, but part of the activity of a network of Jesuits 
interested in mathematics and mathematical physics, especially work in English, with 
Maclaurin one of the authors of interest to them. [84, pp. 33, 221, 278, 517, 655] 
For instance, Pézénas himself translated other English works, including those by 
Desaguliers, Gardiner’s logarithmic tables, and Seth Ward’s Young Mathematician’s 
Guide. [83, pp. 571-2] Thus there was a well-worn path connecting 
English-language work with interested Continental readers. Furthermore, the two-fold 
character of the Treatise of Fluxions was noted, with special praise for Book II’s 
treatment of series, by Silvestre-François Lacroix in the historical introduction to 
the second edition of his highly influential three-volume calculus textbook. 
[52, p. xxvii] Unfortunately, though, recognition of the two-fold character has been 
absent from the literature almost completely from Lacroix’s time until the recent 
work by Sageng and Guicciardini. [42] [78] We shall address the reasons for this 
neglect in due course. 


4. THE SOCIAL CONTEXT: THE SCOTTISH ENLIGHTENMENT. Another 
reason for doubting the standard picture comes from the social context of Maclau- 
rin’s career. Eighteenth-century Scotland, Maclaurin’s home, was anything but an 
intellectual backwater. It was full of first-rate thinkers who energetically pursued 
science and philosophy and whose work was known and respected throughout 
Europe. One would expect Scotland’s leading mathematician to share these con- 
nections and this international renown, and he did. 

Although Scotland had been deprived of its independent national government 
by the Act of Union of 1707, it still retained, besides its independent legal system 
and its prevailing religion, its own educational system. The strength and energy of 
Scottish higher education in Maclaurin’s time is owed in large part to the Scottish 
ruling classes, landowners and merchants alike, who saw science, mathematics, and 
philosophy as keys to what they called the “improvement” of their yet underdevel- 
oped nation. [65, p. 254] [80, pp. 7-8, 10-11] [17, pp. 127, 132-3] Eighteenth- 
century Scotland, with one-tenth the population of England, had four major 
universities to England’s two. [80, p. 116] Maclaurin, when he wrote the Treatise of 
Fluxions, was Professor of Mathematics at the University of Edinburgh. Edinburgh 
was about to become the heart of the Scottish Enlightenment, and Maclaurin until 
his death in 1746 was a leading figure in that city’s cultural life. 
Mathematics played a major role in the Scottish university curriculum. This was 
in part for engineers; Scottish military engineers were highly in demand even on 
the Continent. [17, p. 125] Maclaurin himself was actively interested in the 
applications of mathematics, and just before his untimely death had planned to 
write a book on the subject. [36] [68, p. xix] In addition, mathematics and 
Newtonian physics were part of the course of study for prospective clergymen. 
[80, p. 20] The influential “Moderate” party in the Church of Scotland appreciated 
the Newtonian reconciliation of science and religion. [16, pp. 53, 57] 

Maclaurin’s position in Edinburgh’s cultural life was not just that of a technically 
competent mathematician. For instance, he was part of the Rankenian society, 
which met at Ranken’s Tavern in Edinburgh to discuss such things as the philoso- 
phy of Bishop Berkeley; the society introduced Berkeley’s philosophy to the 
Scottish university curriculum. [24, p. 222] [17, p. 133] [65, p. 197] Maclaurin and 
his physician friend Alexander Monro were the founders and moving spirits of the 
Edinburgh Philosophical Society. [65, p. 198] With Newton’s encouragement, 
Maclaurin had become the chief spokesman in Scotland for the new Newtonian 
physics. His posthumously published book, An Account of Sir Isaac Newton’s 
Philosophical Discoveries, was based on material Maclaurin used in his classes at 
Edinburgh, and the book was of great interest to philosophers. [24, p. 137] That 
book became well known on the Continent. It was translated into French almost as 
soon as it appeared, by Louis-Anne Lavirotte in 1749, and the first part appeared 
in Italian in Venice in 1762. 

Another branch of Scottish science, namely medicine, also had many links with 
the Continent and was highly regarded there. Medical students went back and 
forth between Scotland, Holland, and France. [17, p. 135] [80, p. 7] 

The best-known figures of eighteenth-century Scotland had major interactions 
with, and influence upon, Continental science and philosophy. [39] [81] Let it 
suffice to mention the names of four: the philosopher David Hume, who was a 
student at Edinburgh in Maclaurin’s time; the geologist James Hutton, who 
attended and admired Maclaurin’s lectures; [34, pp. 577-8] and, a bit after 
Maclaurin’s time but still subject to his influence on Scottish higher education, the 
chemist Joseph Black and the economic and political philosopher Adam Smith. 
Maclaurin himself had twice won prizes from the Académie des Sciences in Paris, 
once in 1724 for a memoir on percussion, and then in 1740 (dividing the prize with 
Daniel Bernoulli, P. Antoine Cavalleri, and Leonhard Euler) for a memoir on the 
tides. [79, p. 611] [39, pp. 400-401] 

Scotland in the eighteenth century nurtured first-rate intellectual work on 
mathematics, philosophy, science, medicine, and engineering, and did it all as part 
of a general European culture. [39, p. 412] [81, passim] The Treatise of Fluxions was 
the major mathematical work of a Scottish mathematician of considerable reputa- 
tion on the Continent, a major work philosophically attuned to the enormously 
influential Newtonian physics and the Continentally popular algebraic symbolism. 
Such a work would certainly be of interest to Continental thinkers. Social consider- 
ations may not suffice to determine mathematical ideas, but they certainly affect 
the mathematician’s ability to make a living, to get research support, and to 
promote contact and communication with other mathematicians and scientists at 
home and abroad. And so it was with Maclaurin. 


5. MACLAURIN’S CONTINENTAL REPUTATION. An even better reason for 
not accepting the traditional view of Maclaurin is that his work demonstrably was 
read in the eighteenth century, and was read by the big names of Continental 
mathematics. He had a Continental acquaintance through travel and correspon- 
dence. Even before the Treatise of Fluxions, his reputation had been enhanced by 
his Académie prizes and by his books on geometry. He was thus a respected 
member of an international network of mathematicians with interests in a wide 
range of subjects, and the publication of the Treatise of Fluxions was eagerly 
anticipated on the Continent. 

The Treatise of Fluxions of 1742 was Maclaurin’s major work on analysis, 
incorporating and somewhat dwarfing what he had done earlier. It contains an 
exposition of the calculus, with old results explained and many new results 
introduced and proved. Maclaurin seems to have included almost everything he 
had done in analysis and its applications to Newtonian physics. In particular, the 
findings of his Paris prize paper on the tides were included and expanded. His 
other papers, the posthumous and relatively elementary Algebra, and his works on 
geometry as such—though highly regarded—do not concern us here, but his 
Continental reputation was enhanced by these as well. 

Let us turn now to some specific evidence for the Continental reputation of 
Maclaurin’s major work. In 1741, Euler wrote to Clairaut that, though he had not 
yet seen the Paris prize papers on the tides, “from Mr. Maclaurin I expect only 
excellent ideas.” [47, p. 87] Euler added that he had heard from England (pre- 
sumably from his correspondent James Stirling) that Maclaurin was bringing out a 
book on “differential calculus,” and asked Clairaut to keep him posted about this. 
In turn, Clairaut asked Maclaurin later in 1741 about his plans for the book, 
[66, p. 348] which Clairaut wanted to see before publishing his own work on the 
shape of the earth. [47, p. 110] Euler did get the Treatise of Fluxions, and read 
enough of it quickly to praise it in a letter to Goldbach in 1743. [48, p. 179] Jean 
d’Alembert, in his Traité de dynamique of 1743, [22, sec. 37, n.] praised the rigor 
brought to calculus by the Treatise of Fluxions. D’Alembert’s most recent biogra- 
pher, Thomas Hankins, argues that Maclaurin’s Treatise, appearing at this time, 
helped persuade d’Alembert that gravity could best be described as a continuous 
acceleration rather than a series of infinitesimal leaps. [44, p. 167] D’Alembert’s 
general approach to the foundations of the calculus in terms of limits clearly was 
influenced by Newton’s and Maclaurin’s championing of limits over infinitesimals, 
in particular by Maclaurin’s clear description of limits in one of the parts of his 
Treatise of Fluxions that explicitly responds to Berkeley’s objections (and which 
incidentally may be the first explicit description of the tangent as the limit of 
secant lines; see Section 7). [44, p. 23] [63, pp. 422-3] Lagrange in his Analytical 
Mechanics [55, p. 243] said that Maclaurin, in the Treatise of Fluxions, was the first 
to treat Newton’s laws of motion in the language of the calculus in a coordinate 
system fixed in space. Though C. Truesdell [80, pp. 250-3] has shown that 
Lagrange was wrong because Johann Bernoulli and Euler were ahead of Maclaurin 
on this, the fact that Lagrange believed this is one more piece of evidence for the 
Continental reputation of Maclaurin as mathematician and physicist. 


6. MACLAURIN’S MATHEMATICS AND ITS IMPORTANCE. The previous 
points show that Maclaurin could have been influential, but not that he was. Five 
examples will reveal both the nature of Maclaurin’s techniques and the scope of his 
influence: a special case of the Fundamental Theorem of Calculus; Maclaurin’s 
treatment of maxima and minima for functions of one variable; the attraction of 
spheroids; what is now called the Euler-Maclaurin summation formula; and elliptic 
integrals. 


a. Key Methods in the Calculus. Two methods were central to the study of 
real-variable calculus in the eighteenth and nineteenth centuries. One of these is 
studying real-valued functions by means of power-series representations. This 
tradition is normally thought first to flower with Euler; it is then most closely 
associated with Lagrange, and, later for complex variables, with Weierstrass. The 
second such method is that of basing the foundations of the calculus on the algebra 
of inequalities—what we now call delta-epsilon proof techniques—and using 
algebraic inequalities to prove the major results of the calculus; this tradition is 
most closely associated with the work of Cauchy in the 1820’s. I have traced these 
traditions back to Lagrange and Euler in my work on the origins of Cauchy’s 
calculus. [38, chs. 3-6] It is surprising, at least if one accepts the standard picture 
of the history of the calculus, that both of these methods—studying functions by 
power series, basing foundations on inequalities—were materially advanced by 
Maclaurin in the Treatise of Fluxions. It is especially striking that the importance of 
Maclaurin’s work on series—work based, it is well to remember, on Newton’s use 
of infinite series—was recognized and praised in 1810 by Lacroix, who also linked 
it with the series-based calculus of Lagrange. [52, p. xxxiii] 

Maclaurin skillfully used algebraic inequalities in his proof of a special case of 
the Fundamental Theorem of Calculus. He showed, for a particular function, that 
if one takes the fluxion of the area under the curve whose equation is y = f(x), 
one gets the function f(x). In his proof, Maclaurin adapted the intuition underly- 
ing Newton’s argument for this fact in De Analysi [69]—that the rate of change of 
the area under a curve is measured by the height of the curve—but Maclaurin’s 
proof is more rigorous. Although Maclaurin’s argument proceeds algebraically, the 
concepts involved resemble those of the Greek “method of exhaustion” (more 
precisely termed by Dijksterhuis “indirect passage to the limit’). [26, p. 130] A key 
step in this Greek work is first to assume that two equal areas or expressions for 
areas are unequal, and then to argue to a contradiction by using inequalities that 
hold among various rectilinear areas. Newton in the Principia had based proofs of 
new results about areas and curves on methods akin to those of the Greeks. 
Maclaurin carried this much further. It was Maclaurin’s “conservative” allegiance 
to Archimedean geometric methods that led him to buttress the kinematic intuition 
of Newton’s calculus with algebraic inequality proofs. 

What Maclaurin proved in the example under discussion is that, if the area 
under a curve up to x is given by $x^n$, the ordinate of the curve must be $y = nx^{n-1}$, 
which is known to be the fluxion of $x^n$. [63, pp. 752-754] Maclaurin’s diagram for 
this is much like the one Newton gave in the De Analysi. [69, pp. 3-4] Maclaurin 


Figure 1 

began by saying that, since x and y increase together, the following inequality 
holds between the areas shown: 


$$x^n - (x-h)^n < yh < (x+h)^n - x^n. \tag{1}$$


(Maclaurin gave this inequality verbally; I have supplied the “ < ” signs; also, I use 
“h” for the increment where Maclaurin used “o”.) Now Maclaurin recalled an 
algebraic identity he had proved earlier: [63, p. 583; inequality notation added] 


$$\text{If } E < F, \text{ then } nF^{n-1}(E - F) < E^n - F^n < nE^{n-1}(E - F). \tag{2}$$


(It may strike the modern reader that, since $nx^{n-1}$ is the derivative of $x^n$, this 
second inequality is a special case of the mean-value theorem for derivatives. I 
shall return to this point later.) 

Now, letting $x - h$ play the role of F and x play the role of E, $E - F$ is h and 
the first inequality in (2) yields 


$$n(x-h)^{n-1}h < x^n - (x-h)^n.$$


Similarly, if $F = x$ and $E = x + h$, then $E - F = h$ and the second inequality in 
(2) becomes 


$$(x+h)^n - x^n < n(x+h)^{n-1}h.$$

Combining these with inequality (1) about the areas, Maclaurin obtained 

$$n(x-h)^{n-1}h < yh < n(x+h)^{n-1}h.$$

Dividing by h produces 

$$n(x-h)^{n-1} < y < n(x+h)^{n-1}. \tag{3}$$


Recall that, given that the area was $x^n$, Maclaurin was seeking an expression for y, 
the fluxion of that area. A modern reader, having reached the inequality (3), might 
stop, perhaps saying “let h go to zero, so that y becomes $nx^{n-1}$,” or perhaps 
justifying the conclusion by appealing to the delta-epsilon characterization of limit. 
What Maclaurin did instead was what Archimedes might have done, a double 
reductio ad absurdum. But what Archimedes might have done geometrically and 
verbally, Maclaurin did algebraically. He assumed first that y is not equal to 
$nx^{n-1}$. Then, he said, it must be equal to $nx^{n-1} + r$ for some r. First, he 
considered the case when this r was positive. This will lead to a contradiction if h 
is chosen so that $y = n(x+h)^{n-1}$, since, he observed, inequality (3) will be 
violated when $h = (x^{n-1} + r/n)^{1/(n-1)} - x$. Similarly, he calculated the h that 
produces a contradiction when r is assumed to be negative. Thus there can be no such 
r, and $y = nx^{n-1}$. [63, p. 753] 
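
The steps of this argument can be checked numerically. The following Python sketch is a modern illustration only and is not drawn from the Treatise; the function names `maclaurin_bounds` and `contradicting_h`, and the sample values x = 2, n = 3, r = 1/2, are assumptions made here for the sake of the example. It confirms that the bounds in (3) enclose $nx^{n-1}$ for several increments, and that a supposed ordinate $nx^{n-1} + r$ with positive r meets the upper bound at the increment named in the text, so that the strict inequality (3) fails there.

```python
from fractions import Fraction


def maclaurin_bounds(x, h, n):
    """Lower and upper bounds on the ordinate y given by inequality (3)."""
    return n * (x - h) ** (n - 1), n * (x + h) ** (n - 1)


def contradicting_h(x, r, n):
    """Increment h at which n*(x+h)**(n-1) equals n*x**(n-1) + r, for r > 0."""
    return (x ** (n - 1) + r / n) ** (1 / (n - 1)) - x


# Hypothetical sample values, chosen only for illustration.
x, n = Fraction(2), 3
claimed = n * x ** (n - 1)  # the value n*x**(n-1), here 12

# Inequality (3) holds for the claimed ordinate at every increment tried.
for h in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    lower, upper = maclaurin_bounds(x, h, n)
    assert lower < claimed < upper

# Suppose instead that y = 12 + r with r > 0 (floats, since a root is taken).
r = 0.5
wrong_y = 12 + r
h_star = contradicting_h(2.0, r, 3)
_, upper_at_h_star = maclaurin_bounds(2.0, h_star, 3)

# At h_star the upper bound coincides with wrong_y (up to rounding),
# so the strict inequality y < n*(x+h)**(n-1) cannot hold.
print(h_star, upper_at_h_star, wrong_y)
```

Exact rational arithmetic is used for the first check so that the strict inequalities are tested without rounding; the second check needs a square root and therefore falls back on floating point.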

Maclaurin introduced this proof by saying something surprising for a Treatise of 
Fluxions: that the use of the inequalities makes the demonstration of the value of y 
“independent of the notion of a fluxion.” [63, p. 752] (Of course one would need 
the notion of fluxion to interpret y as the fluxion of the area function $x^n$, but the 
proof itself is algebraic.) This proof was presumably part of his agenda in writing 
the more algebraic Book II of the Treatise for an audience on the Continent, where 
fluxions were suspect as involving the idea of motion. Later Lagrange, in seeking 
his purely algebraic foundation for the calculus, explicitly said he wanted to free 
the calculus from fluxions and what he called the “foreign idea” of motion. It is 
thus striking that Lagrange’s Théorie des fonctions analytiques (1797) gives a more 
general version of the kind of argument Maclaurin had given, applying to any 
increasing function that satisfies the geometric inequality expressed in (1). In 
place of the algebraic inequality (2), Lagrange used the mean-value theorem. 
[58, pp. 238-9] [38, pp. 156-158] The similarity of the two arguments does not 
prove influence, of course, but it certainly demonstrates that Maclaurin’s work, 
which we know Lagrange read (e.g., [58, p. 17]), uses the algebra of inequalities in 
a way consistent with that used by Lagrange and his successors. 

Maclaurin’s argument exemplifies the way his Treatise reconciles the old and the 
new. The double reductio ad absurdum reflects his Archimedean agenda. Treating 
the area as generated by a moving vertical line, and then searching for the 
relationship between the area and its fluxion, are Newtonian. Maclaurin did not 
have a general proof of the Fundamental Theorem in this argument, but relied on 
an inequality based on the specific properties of a specific function. Nonetheless, 
he had the precise bounding inequalities for the area function used later by 
Lagrange, and he used an algebraic inequality proof in a manner that would not 
disgrace a nineteenth-century analyst. 

Inequality-based arguments in the calculus as used by Lagrange and Cauchy 
owe a lot to the eighteenth-century study of algebraic approximations, and it once 
seemed to me that this was their origin. But the algebra of inequalities as used in 
Continental analysis, especially in d’Alembert’s pioneering treatment of the tan- 
gent as the limit of secants in the article “Différentiel’ in the Encyclopédie, [19] 
must owe something also to Maclaurin’s translation of Archimedean geometry into 
algebraic dress to justify results in calculus. Throughout the eighteenth century, 
practitioners of the limit tradition on the Continent use inequalities; a clear line of 
influence connects Maclaurin’s admirer d’Alembert, Simon L’Huilier (who was a 
foreign member of the Royal Society), the textbook treatment of limits by Lacroix, 
and, finally, Cauchy. [38, pp. 80-87] 

Now let us turn to some of Maclaurin’s work on series. There is, of course, the 
Maclaurin series, that is, the Taylor series expanded around zero. This result 
Maclaurin himself credited to Taylor, and it was known earlier to Newton and 
Gregory. It was called the Maclaurin series by John F. W. Herschel, Charles 
Babbage, and George Peacock in 1816 [51, pp. 620-21] and by Cauchy in 1823. [14, 
p. 257] Since it was obvious that Maclaurin had not invented it, the attribution 
shows appreciation by these later mathematicians for the way Maclaurin used the 
series to study functions. A key application is Maclaurin’s characterization of 
maxima, minima, and points of inflection of an infinitely differentiable function by 
means of its successive derivatives. When the first derivative at a point is zero, 
there is a maximum if the second derivative is negative there, a minimum if it is 
positive. If the second derivative is also zero, one looks at higher derivatives to tell 
whether the point is a maximum, minimum, or point of inflection. These results 
can be proved by looking at the Taylor series of the function near the point in 
question, and arguing on the basis of the inequalities expressed in the definition of 
maximum and minimum. For instance (in modern [Lagrangian] notation), if f(x) is 
a maximum, then 

$$f(x) > f(x+h) = f(x) + hf'(x) + \frac{h^2}{2!}f''(x) + \cdots, \quad \text{and} \tag{4}$$
$$f(x) > f(x-h) = f(x) - hf'(x) + \frac{h^2}{2!}f''(x) - \cdots$$
if h is small. If the derivatives are bounded, and if h is taken sufficiently small so 
that the term in h dominates the rest, the inequalities (4) can both hold only if 
$f'(x) = 0$. If $f'(x) = 0$, then the $h^2$ term dominates, and the inequalities (4) hold 
only if $f''(x)$ is negative. And so on. 
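
The rule of successive derivatives just described can be sketched in a few lines of Python. This is a modern illustration under stated assumptions, not Maclaurin's or Lagrange's own procedure: the function is taken to be a polynomial given by its list of coefficients, so that every derivative can be computed exactly, and the names `derivative`, `evaluate`, and `classify_critical_point` are hypothetical.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], c_k multiplying x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule (an empty list gives 0)."""
    total = 0
    for c in reversed(coeffs):
        total = total * x + c
    return total


def classify_critical_point(coeffs, x):
    """Classify x by the first derivative of order >= 2 that does not vanish there."""
    d = derivative(coeffs)
    if evaluate(d, x) != 0:
        return "not a critical point"
    order = 1
    while d:
        d = derivative(d)
        order += 1
        value = evaluate(d, x)
        if value != 0:
            if order % 2 == 1:
                # An odd-order term changes sign with h: neither maximum nor minimum.
                return "point of inflection"
            return "maximum" if value < 0 else "minimum"
    return "undetermined"  # every derivative vanishes at x (constant polynomial)


print(classify_critical_point([0, 0, 1], 0))         # x**2
print(classify_critical_point([0, 0, 0, 1], 0))      # x**3
print(classify_critical_point([0, 0, 0, 0, -1], 0))  # -x**4
```

Restricting the sketch to polynomials with integer or rational coefficients keeps the test for a vanishing derivative exact; for a general function one would have to decide when a computed value counts as zero.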

I have traced Cauchy’s use of this technique back to Lagrange, and from 
Lagrange back to Euler. [38, pp. 117-118] [37, pp. 157-159] [58, pp. 235-6] [29, 
Secs. 253-254] But this technique is explicitly worked out in Maclaurin’s Treatise of 
Fluxions. Indeed, it appears twice: once in geometric dress in Book I, Chapter IX, 
and then more algebraically in Book II. [63, pp. 694-696] Euler, in the version he 
gave in his 1755 textbook, [20] does not refer to Maclaurin on this point, but then 
he makes few references in that book at all. Still we might suspect, especially 
knowing that Stirling told Euler in a letter of 16 April 1738 [91] that Maclaurin had 
some interesting results on series, that Euler would have been particularly inter- 
ested in looking at Maclaurin’s applications of the Taylor series. Certainly Lacroix’s 
praise for Maclaurin’s work on series must have taken this set of results into 
account. [52, p. xxvii] Even more important, Lagrange, in unpublished lectures on 
the calculus from Turin in the 1750's, after giving a very elementary treatment of 
maxima and minima, referred to volume II of Maclaurin’s Treatise of Fluxions as 
the chief source for more information on the subject. [7, p. 154] Since Lagrange did 
not mention Euler in this connection at all, Lagrange may well not even 
have seen the Institutiones calculi differentialis of 1755 when he made this 
reference. This Taylor-series approach to maxima and minima (with the Lagrange 
remainder supplied for the Taylor series) plays a major role in the work of 
Lagrange, and later in the work of Cauchy. It is because Maclaurin thought of 
maxima and minima, and of convexity and concavity, in Archimedean geometrical 
terms that he was led to look at the relevant inequalities, just as the geometry of 
Archimedes helped Maclaurin formulate some of the inequalities he used to prove 
his special case of the Fundamental Theorem of Calculus. 


b. Ellipsoids. We now turn to work in applied mathematics that constitutes one of 
Maclaurin’s great claims to fame: the gravitational attraction of ellipsoids and the 
related problem of the shape of the earth. Maclaurin is still often regarded as the 
creator of the subject of attraction of ellipsoids. [85, pp. 175, 374] In the eighteenth 
century, the topic attracted serious work from d’Alembert, A.-C. Clairaut, Euler, 
Laplace, Lagrange, Legendre, Poisson, and Gauss. In the twentieth century, 
Subramanyan Chandrasekhar (later Nobel laureate in physics) devoted an entire 
chapter of his classic Ellipsoidal Figures of Equilibrium to the study of Maclaurin 
spheroids (figures that arise when homogeneous bodies rotate with uniform 
angular velocity), the conditions of stability of these spheroids and their harmonic 
modes of oscillation, and their status as limiting cases of more general figures of 
equilibrium. Such spheroids are part of the modern study of classical dynamics in 
the work of scientists like Chandrasekhar, Laurence Rossner, Carl Rosenkilde, and 
Norman Lebovitz. [15, pp. 77-100] Already in 1740 Maclaurin had given a 
“rigorously exact, geometrical theory” of homogeneous ellipsoids subject to 
inverse-square gravitational forces, and had shown that an oblate spheroid is a 
possible figure of equilibrium under Newtonian mutual gravitation, a result with 
obvious relevance for the shape of the earth. [39, p. 172] [86, p. xix] [85, p. 374] 
Of particular importance was Maclaurin’s decisive influence on Clairaut. 
Maclaurin and Clairaut corresponded extensively, and Clairaut’s seminal 1743 
book La Figure de la Terre [18] frequently, explicitly, and substantively cites his 
debts to Maclaurin’s work. [39, pp. 590-597] A key result, that the attractions of 
two confocal ellipsoids at a point external to both are proportional to their masses 
and are in the same direction, was attributed to Maclaurin by d’Alembert, an 
attribution repeated by Laplace, Lagrange, and Legendre, then by Gauss, who 
went back to Maclaurin’s original paper, and finally by Lord Kelvin, who called it 
“Maclaurin’s splendid theorem.” [15, p. 38] [85, pp. 145, 409] Lagrange began his 
own memoir on the attraction of ellipsoids by praising Maclaurin’s treatment in the 
prize paper of 1740 as a masterwork of geometry, comparing the beauty and 
ingenuity of Maclaurin’s work with that of Archimedes, [57, p. 619] though 
Lagrange, typically, then treated the problem analytically. Maclaurin’s eighteenth- 
and nineteenth-century successors also credit him with some of the key methods 
used in studying the equilibrium of fluids, such as the method of balancing 
columns. [39, p. 597] Maclaurin’s work on the attraction of ellipsoids shows how his 
geometric insights fruitfully influenced a subject that later became an analytic one. 


c. The Euler-Maclaurin Formula. The Euler-Maclaurin formula expresses the 
value of definite integrals by means of infinite series whose coefficients involve 
what are now called the Bernoulli numbers. The formula shows how to use 
integrals to find the partial sums of series. Maclaurin’s version, in modern notation, 
is: 


$$\sum_{h=0} F(a+h) = \int_{0} F(x)\,dx + \tfrac{1}{2}F(a) + \tfrac{1}{12}F'(a) - \tfrac{1}{720}F'''(a) + \tfrac{1}{30240}F^{(5)}(a) - \cdots$$
[35, pp. 84-86] 


James Stirling in 1738, congratulating Euler on his publication of that formula, told 
Euler that Maclaurin had already made it public in the first part of the Treatise of 
Fluxions, which was printed and circulating in Great Britain in 1737. [47, p. 88n] 
[91, p. 178] (On this early publication, see also [63, pp. iii, 691n].) P. L. Griffiths has 
argued that this simultaneous discovery rests on De Moivre’s work on summing 
reciprocals, which also involves the so-called Bernoulli numbers. [40] [41, pp. 
16-17] [25, p. 19] In any case, Euler and Maclaurin derived the Euler-Maclaurin 
formula in essentially the same way, from a similar geometric diagram and then by 
integrating various Taylor series and performing appropriate substitutions to find 
the coefficients. [31] [32] [33] Maclaurin’s approach is no more Archimedean or 
geometric than Euler’s; they are similar and independent. [63, pp. 289-293, 
672-675] [35, pp. 84-93] [67] In subsequent work, Euler went on to extend and 
apply the formula further to many other series, especially in his Introductio in 
analysin infinitorum of 1748 and Institutiones calculi differentialis of 1755. [35, 
p. 127] But Maclaurin, like Euler, had applied the formula to solve many problems. 
[63, pp. 676-693] For instance, Maclaurin used it to sum powers of arithmetic 
progressions and to derive Stirling’s formula for factorials. He also derived what is 
now called the Newton-Cotes numerical integration formula, and obtained what is 
now called Simpson’s rule as a special case. It is possible that his work helped 
stimulate Euler’s later, fuller investigations of these important ideas. 
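
The application to sums of powers can be illustrated with a short Python sketch. It uses the endpoint form of the formula found in modern textbooks, in which the sum of f(k) for k from 0 to n equals the integral of f from 0 to n, plus half the sum of the two end values, plus the corrections 1/12, -1/720, and 1/30240 applied to the differences of the first, third, and fifth derivatives at the two ends. This is an assumption about presentation and is not claimed to be Maclaurin's own arrangement; the names `power_derivative` and `power_sum` are likewise hypothetical. For $f(x) = x^p$ with p at most 6, all later corrections vanish, so the three terms retained give the sum exactly.

```python
from fractions import Fraction
from math import factorial

# Derivative orders paired with the coefficients B_2/2!, B_4/4!, B_6/6!.
CORRECTIONS = [(1, Fraction(1, 12)), (3, Fraction(-1, 720)), (5, Fraction(1, 30240))]


def power_derivative(p, order, x):
    """Value at x of the derivative of the given order of x**p."""
    if order > p:
        return Fraction(0)
    return Fraction(factorial(p), factorial(p - order)) * Fraction(x) ** (p - order)


def power_sum(p, n):
    """Sum of k**p for k = 0..n by the Euler-Maclaurin formula (exact for 0 <= p <= 6)."""
    integral = Fraction(n) ** (p + 1) / (p + 1)
    endpoints = (power_derivative(p, 0, 0) + power_derivative(p, 0, n)) / 2
    tail = sum(
        c * (power_derivative(p, m, n) - power_derivative(p, m, 0))
        for m, c in CORRECTIONS
    )
    return integral + endpoints + tail


# Compare with direct addition for the cubes 0**3 + 1**3 + ... + 10**3.
print(power_sum(3, 10), sum(k ** 3 for k in range(11)))
```

For functions that are not polynomials the series of corrections generally does not terminate, and a remainder term of the kind later supplied by Jacobi is needed to say how close a truncated sum is to the true value.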

In 1772, Lagrange generalized the Euler-Maclaurin formula, which he obtained 
as a consequence of his new calculus of operators. [53] [35, pp. 169, 261] In 1834, 
Jacobi provided the formula with its remainder term, [46, pp. 263, 265] in the same 
paper in which he first introduced what are now called the Bernoulli polynomials. 
Jacobi, who called the result simply the Maclaurin summation formula, cited it 
directly from the Treatise of Fluxions. [46, p. 263] Later, Karl Pearson used the 
formula as an important tool in his statistical work, especially in analyzing 
frequency curves. [72, pp. 217, 262] 

The Euler-Maclaurin formula, then, is an important result in the mainstream of 
mathematics, with many applications, for which Maclaurin, both in the eighteenth 
century and later on, has rightly shared the credit. 


d. Elliptic Integrals. Some integrals (Maclaurin used the Newtonian term 
“fluents”) are algebraic functions, Maclaurin observed. Others are not, but some 
of these can be reduced to finding circular arcs, others to finding logarithms. By 
analogy, Maclaurin suggested, perhaps a large class of integrals could be studied by 
being reduced to finding the length of an elliptical or hyperbolic arc. [63, p. 652] By 
means of clever geometric transformations, Maclaurin was able to reduce the 
integral that represented the length of a hyperbolic arc to a ‘nice’ form. Then, by 
algebraic manipulation, he could reduce some previously intractable integrals to 
that same form. His work was translated into analysis by d’Alembert and then 
generalized by Euler. [13, p. 846] [23] [27, p. 526] [28, p. 258] In 1764, Euler found a 
much more elegant, general, and analytic version of this approach, and worked out 
many more examples, but cited the work of Maclaurin and d’Alembert as the 
source of his investigation. A.-M. Legendre, the key figure in the eighteenth-cen- 
tury history of elliptic integrals, credited Euler with seeing that, by the aid of a 
good notation, arcs of ellipses and other transcendental curves could be as 
generally used in integration as circular and logarithmic arcs. [45, p. 139] Legendre 
was, of course, right that “elliptic integrals” encompass a wide range of examples; 
this was exactly Maclaurin’s point. Thus, although his successors accomplished 
more, Maclaurin helped initiate a very important investigation and was the first to 
appreciate its generality. Maclaurin’s geometric insight, applied to a problem in 
analysis, again brought him to a discovery. 
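
To make concrete what “finding the length of an elliptical arc” involves, here is a minimal Python sketch that approximates the arc of one quadrant of an ellipse by Simpson's rule, the rule mentioned above in connection with Maclaurin's work on numerical integration. It is a modern numerical illustration and not Maclaurin's geometric reduction; the parametrization $x = a\cos t$, $y = b\sin t$, the function names `simpson` and `quarter_ellipse_arc`, and the choice of 200 subintervals are all assumptions made for the example.

```python
from math import cos, pi, sin, sqrt


def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def quarter_ellipse_arc(a, b):
    """Arc length of one quadrant of the ellipse x = a*cos(t), y = b*sin(t)."""
    def speed(t):
        # Length of the velocity vector (-a*sin(t), b*cos(t)).
        return sqrt((a * sin(t)) ** 2 + (b * cos(t)) ** 2)

    return simpson(speed, 0.0, pi / 2)


# When a = b the ellipse is a circle and the quadrant has length pi/2.
print(quarter_ellipse_arc(1.0, 1.0), pi / 2)
print(quarter_ellipse_arc(2.0, 1.0))
```

The circular case serves as a check, since there the integrand is constant; for unequal axes the integral cannot be expressed through algebraic, circular, or logarithmic functions, which is why such arcs came to be treated as a new class of standard quantities.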


7. OTHER EXAMPLES OF MACLAURIN’S MATHEMATICAL INFLUENCE. 
The foregoing examples provide evidence of direct influence of the Treatise of 
Fluxions on Continental mathematics. There is much more. For instance, Lacroix, 
in his treatment of integrals by the method of partial fractions, called it “the 
method of Maclaurin, followed by Euler.” [52, Vol. II, p. 10] [63, pp. 634-644] Of 
interest too is Maclaurin’s clear understanding of the use of limits in founding the 
calculus, especially in the light of his likely influence on d’Alembert’s treatment of 
the foundations of the calculus by means of limits in the Encyclopédie, which in 
turn influenced the subsequent use of limits by L’Huilier, Lacroix, and Cauchy, [38, 
chapter 3] (and on Lagrange’s acceptance of the limit approach in his early work in 
the 1750’s). [7] Although the largest part of Maclaurin’s reply to Berkeley was the 
extensive proof of results in calculus using Greek methods, he was willing to 
explain important concepts using limits also. In particular, Maclaurin wrote, “As 
the tangent of an arch [arc] is the right line that limits the position of all the 
secants that can pass through the point of contact... though strictly speaking it be 
no secant; so a ratio may limit the variable ratios of the increments, though it 
cannot be said to be, the ratio of any real increments.” [63, p. 423] Maclaurin’s 
statement answers Berkeley’s chief objection—that the increment in a function’s 
value is first treated as non-zero, then as zero, when one calculates the limit of the 
ratio of increments or finds the tangent to a curve. Maclaurin’s statement is in the 
tradition of Newton’s Principia (Book I, Scholium to Lemma XI), but is in a 
form much closer to the later work of d’Alembert on secants and tangents. [20] 
Maclaurin pointed out that most of the propositions of the calculus that he could 
prove by means of geometry “may be briefly demonstrated by this method [of 
limits].” [63, p. 87, my italics] 
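
Maclaurin's description of the tangent as the line that limits the secants can be illustrated numerically. The following Python sketch is a modern example under assumptions of my own: the curve $y = x^3$, the point x = 1, and the names `secant_slope` and `cube` do not come from the Treatise. It shows that no secant slope is ever equal to 3, the slope of the tangent, although the difference falls below any assigned fraction as the increment shrinks.

```python
from fractions import Fraction


def secant_slope(f, x, h):
    """Slope of the secant through (x, f(x)) and (x + h, f(x + h)), for h != 0."""
    return (f(x + h) - f(x)) / h


def cube(x):
    return x ** 3


# Hypothetical example: the curve y = x**3 at x = 1, where the tangent has slope 3.
increments = [Fraction(1, 10 ** k) for k in range(1, 6)]
gaps = [secant_slope(cube, Fraction(1), h) - 3 for h in increments]

# Each gap equals 3*h + h**2: never zero, but as small as one pleases.
for h, gap in zip(increments, gaps):
    print(h, gap)
```

The increment is never set to zero in this computation, which is the point of Maclaurin's reply: the limiting ratio is approached by the ratios of real increments without itself being one of them.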

In addition, Maclaurin had considerable influence in Britain, on mathemati- 
cians like John Landen (whose work on series was praised by Lagrange), Robert 
Woodhouse (who sparked the new British interest in Continental work about 
1800), and on Edward Waring and Thomas Simpson, whose names are attached to 
results well known today. [42] Going beyond the calculus, Maclaurin’s purely 
geometric treatises were read and used by French geometers of the stature of 
Chasles and Poncelet. [90, p. 145] Thus, though Maclaurin may not have been the 
towering figure Euler was, he was clearly a significant and respected mathemati- 
cian, and the Treatise of Fluxions was far more than an unread tome whose weight 
served solely to crush Bishop Berkeley. 


8. WHY A TREATISE OF FLUXIONS? The Treatise of Fluxions was not really 
intended as a reply to Berkeley. Maclaurin could have refuted Berkeley with a 
pamphlet. It was not a student handbook either; this work is far from elementary. 
Nor was it merely written to glory in Greek geometry. Maclaurin wrote several 
works on geometry per se. But he was no antiquarian. Instead, the Treatise of 
Fluxions was the major outlet for Maclaurin’s solution of significant research 
problems in the field we now call analysis. Geometry, as the examples I gave 
illustrate, was for Maclaurin a source of motivation, of insight, and of problem- 
solving power, as well as being his model of rigor. 

For Maclaurin, rigor was not an end in itself, or a goal pursued for purely 
philosophical reasons. It was motivated by his research goals in analysis. For 
instance, Maclaurin developed his theory of maxima, minima, points of inflection, 
convexity and concavity, orders of contact, etc., because he wanted to study curves 
of all types, including those that cross over themselves, loop around and are 
tangent to themselves, and so on. He needed a sophisticated theory to characterize 
the special points of such curves. Again, in problems as different as studying the 
attraction of ellipsoids and evaluating integrals approximately, he needed to use 
infinite series and know how close he was to their sum. Thus, rigor, to Maclaurin, 
was not merely a tool to defend Newton’s calculus against Berkeley—though it was 
that—nor just a response to the needs of a professor to present his students a 
finished subject—though it may have been that as well. In many examples, 
Maclaurin’s rigor serves the needs of his research. 

Moreover, the Treatise of Fluxions contains a wealth of applications of fluxions, 
from standard physical problems such as curves of quickest descent to mathemati- 
cal problems like the summation of power series—in the context of which, 
incidentally, Maclaurin gave what may be the earliest clear definition of the sum of 
an infinite series: “There are progressions of fractions which may be continued at 
pleasure, and yet the sum of the terms be always less than a certain finite number. 
If the difference betwixt their sum and this number decrease in such a manner, 
that by continuing the progression it may become less than any fraction how small 
soever that can be assigned, this number is the limit of the sum of the progression, 
and is what is understood by the value of the progression when it is supposed to be 
continued indefinitely.” [63, p. 289] Thus, though eighteenth-century Continental 
mathematicians did not care passionately about foundations, [38, pp. 18-24] they 
could still appreciate the Treatise of Fluxions because they could mine it for results 
and techniques. 
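
Maclaurin's definition of the sum of a progression translates directly into a computation: given any assigned fraction, one continues the progression until the difference between the partial sum and the proposed limit is smaller than that fraction. The Python sketch below is a modern illustration only; the function name `terms_needed` and the example of the progression 1/2 + 1/4 + 1/8 + ... with limit 1 are assumptions chosen here, and the loop would not terminate if the proposed number were not in fact the limit.

```python
from fractions import Fraction


def terms_needed(limit, term, tolerance):
    """Number of terms after which the partial sum lies within tolerance of limit.

    term(k) is the k-th term, k = 1, 2, 3, ...; the terms are assumed positive
    and the partial sums are assumed to stay below the limit.
    """
    partial, count = Fraction(0), 0
    while limit - partial >= tolerance:
        count += 1
        partial += term(count)
    return count, partial


def halving(k):
    """k-th term of the progression 1/2, 1/4, 1/8, ..."""
    return Fraction(1, 2 ** k)


# However small the assigned fraction, finitely many terms suffice.
for tolerance in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10 ** 6)):
    print(tolerance, terms_needed(Fraction(1), halving, tolerance))
```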


9. WHY THE TRADITIONAL VIEW? If the reader is convinced by now that the 
traditional view is wrong, that Maclaurin’s Treatise did not mark the end of the 
Newtonian tradition, and that not all of modern analysis stems solely from 
the work of Leibniz and his school, the question arises, how did that traditional 
view come to be, and why has it been so persistent? 

Perhaps the traditional view could be explained as follows. Consider the 
approach to mathematics associated with Descartes: symbolic power, not debates 
over foundations; problem-solving power, not axioms or long proofs. The Cartesian 
approach to mathematics is clearly reflected in the work and in the rhetoric of 
Leibniz, Johann Bernoulli, Euler, Lagrange—especially in the historical prefaces 
to his influential works—and even Cauchy. These men, the giants of their time, 
are linked in a continuous chain of teachers, close colleagues, and students. Some 
topics, like partial differential equations and the calculus of variations, were 
developed mostly on the Continent. Moreover, the Newton-Leibniz controversy 
helped drive English and Continental mathematicians apart. Thus the Continental 
tradition can be viewed as self-contained, and the outsider sees no need for 
eighteenth-century Continental mathematicians to struggle through 750 pages of a 
Treatise of Fluxions, which is at best in the Newtonian notation and at worst in the 
language of Greek geometry. Lagrange’s well-known boast that his Analytical 
Mechanics [55] had (and needed) no diagrams, thus opposing analysis to geometry 
at the latter’s expense, reinforced these tendencies and enshrined them in histori- 
cal discourse. But the explanation we have just given does not suffice to explain the 
strength, and persistence into the twentieth century, of the standard interpretation. 
The traditional view of Maclaurin’s lack of importance has been reinforced by 
some other historiographical tendencies that deserve our critical attention. 

The traditional picture of Maclaurin’s Treatise of Fluxions radically separates his 
work on foundations, which it regards as geometric, sterile, and antiquarian, from 
his important individual results, which often are mentioned in histories of mathe- 
matics but are treated in isolation from the purpose of the Treatise, in isolation 
from one another, and in isolation from Maclaurin’s overall approach to mathe- 
matics. Strangely, both externalist and internalist historians, each for different 
reasons, have reinforced this picture. 

For instance, in the English-speaking world, viewing the Treatise as only about 
Maclaurin’s foundation for the calculus, and thus as a dead end, has been 
perpetuated by the “decline of science in England” school of the history of 
eighteenth-century science, stemming from such early nineteenth-century figures 
as John Playfair, and, especially, Charles Babbage. [77] [2] [4] Babbage felt strongly 
about this because he was a founder of the Cambridge Analytical Society, which 
fought to introduce Continental analysis into Cambridge in the early nineteenth 
century. This group had an incentive to exaggerate the superiority of Continental 
mathematics and downgrade the British, as is exemplified by their oft-quoted 
remark that the principles of “pure d-ism” should replace what they called the 
“dot-age” of the University. [5, ch. 7] [10, p. 274] The pun, playing on the 
Leibnizian and Newtonian notation in calculus, may be found in [2, p. 26]. These 
views continued to be used in the attempt by Babbage and others to reform the 
Royal Society and to increase public support for British science. 

It is both amusing and symptomatic of the misunderstanding of Maclaurin’s 
influence that Lacroix’s one-volume treatise on the calculus of 1802, [50] translated 
into English by the Cambridge Analytical Society with added notes on the method 
of series of Lagrange, [51] was treated by them, and has been considered since, as 
a purely “Continental” work. But Lacroix’s short treatise was based on the concept 
of limit, which was Newtonian, elaborated by Maclaurin, adapted by d’Alembert 
and L’Huilier, and finally systematized by Lacroix. [38, pp. 81-86] Moreover, the 
translators’ notes by Babbage, Herschel, and Peacock supplement the text by 
studying functions by their Taylor series, thus using the approach that Lacroix 
himself, in his multi-volume treatise of 1810, had attributed to Maclaurin. This is, 
of course, not to deny the overwhelming importance of the contributions of Euler 
and Lagrange, both to the mathematics taught by the Analytical Society and to 
that included by Lacroix in his 1802 book, nor to deny the Analytical Society’s 
emphasis on a more abstract and formal concept of function. But all the same, 
Babbage, Herschel, and Peacock were teaching some of Maclaurin’s ideas without 
realizing this. 

In any case, the views expressed by Babbage and others have strongly influenced 
Cambridge-oriented writers like W. W. Rouse Ball, who said that the history 
of eighteenth-century English mathematics “leads nowhere.” [5, p. 98] H. W. 
Turnbull, though he wrote sympathetically about Maclaurin’s mathematics on one 
occasion, [88] blamed Maclaurin on another occasion for the decline: “When 
Maclaurin produced a great geometrical work on fluxions, the scale was so heavily 
loaded that it diverted England from Continental habits of thought. During the 
remainder of the century, British mathematics were relatively undistinguished.” 
[89, p. 115] 

Historians of Scottish thought, working from their central concerns, have also 
unintentionally contributed to the standard picture. George Elder Davie, arguing 
from social context to a judgment of Maclaurin’s mathematics, held that the Scots, 
unlike the English, had an anti-specialist intellectual tradition, based in philoso- 
phy, and emphasizing “cultural and liberal values.” Wishing to place Maclaurin in 
this context, Davie stressed what he called Maclaurin’s “mathematical Hellenism,” 
[24, p. 112] and was thus led to circumscribe the achievement of the Treatise of 
Fluxions as having based the calculus “on the Euclidean foundations provided by 
[Robert] Simson,” [24, p. 111] who had made the study of the writings of the 
classical Greek geometers the “national norm” in Scotland. The “Maclaurin is a 
geometer” interpretation among Scottish historians has been further reinforced by 
a debate in 1838 over who would fill the Edinburgh chair in mathematics. Phillip 
Kelland, a candidate from Cambridge, was seen as the champion of Continental 
analysis, while the partisans of Duncan Gregory argued for a more geometrical 
approach. Wishing to enlist the entire Scottish geometric tradition on the side of 
Gregory, Sir William Hamilton wrote, “The great Scottish mathematicians, ...even 
Maclaurin, were decidedly averse from the application of the mechanical proce- 
dures of algebra.” [24, p. 155] Though Kelland eventually won the chair, the 
dispute helped spread the view that Maclaurin had been hostile to analysis. More 
recently, Richard Olson has characterized Scottish mathematics after Maclaurin as 
having been conditioned by Scottish common-sense philosophy to be geometric in 
the extreme. [70, pp. 4, 15] [71, p. 29] But in emphasizing Maclaurin’s influence on 
this development, Olson, like Davie, has overstated the degree to which Maclaurin’s 
approach was geometric. 

By contrast, consider internalist historians. The treatment of Maclaurin’s results 
as isolated reflects what Herbert Butterfield called the Whig approach to history, 
viewing the development of eighteenth-century mathematics as a linear progres- 
sion toward what we value today, the collection of results and techniques which 
make up classical analysis. Thus, mathematicians writing about the history of this 
period, from Moritz Cantor in the nineteenth century to Hermann Goldstine and 
Morris Kline in the twentieth, tell us what Maclaurin did with specific results, 
some named after him, for which they have mined the Treatise of Fluxions. [13, 
pp. 655-63] [35, pp. 126ff, 167-8] [49, pp. 522-3, 452, 442] They either neglect the 
apparently fruitless work on foundations, or, viewing it as geometric, see it as a 
step backward. It is of course true that many Continental mathematicians used 
Maclaurin’s results without accepting the geometrical and Newtonian insights that 
Maclaurin used to produce them. But without those points of view, Maclaurin 
would not have produced those results. 

Both externalist and internalist historians, then, have treated Maclaurin’s work 
in the same way: as a throwback to the Greeks, with a few good results that 
happen to be in there somewhat like currants in a scone. Further, the fact that 
Maclaurin’s book, especially its first hundred pages, is very hard to read, particularly 
for readers schooled in modern analysis, has encouraged historians who focus on 
foundations to read only the introductory parts. The fact that there is so much 
material has encouraged those interested in results to look only at the sections of 
interest to them. And the fact that the first volume is so overwhelmingly geometric 
serves to reinforce the traditional picture once again whenever anybody opens the 
Treatise. The recent Ph.D. dissertation by Erik Sageng [78] is the first example of a 
modern scholarly study of Maclaurin’s Treatise in any depth. The standard picture 
has not yet been seriously challenged in print. 


10. SOME FINAL REFLECTIONS. Maclaurin’s work had Continental influence, 
but with an important exception—his geometric foundation for the calculus. 
Mastering this is a major effort, and I know of no evidence that any eighteenth- 
century Continental mathematician actually did so. Lagrange perhaps came the 
closest. In the introduction to his Théorie des fonctions analytiques, Lagrange could 
say only that Maclaurin had done a good job of basing the calculus on Greek geometry, 
so that it can be done, but that it is very hard. [58, p. 17] In an unpublished draft of this introduction, 
Lagrange said more pointedly: “I appeal to the evidence of all those with the 
courage to read the learned treatise of Maclaurin and with enough knowledge to 
understand it: have they, finally, had their doubts cleared up and their spirit 
satisfied?” [73, p. 30] 

Something else may have blunted people’s views of the mathematical quality of 
Maclaurin’s Treatise. The way the book is constructed partly reflects the Scottish 
intellectual milieu. The Enlightenment in Britain, compared with that on the 
Continent, was marked less by violent contrast and breaks with the past than by a 
spirit of bridging and evolution. [75, pp. 7-8, 15] Similarly, Scottish reformers 
operated less by revolution than by the refurbishment of existing institutions. [16, 
p. 8] These trends are consistent with the two-fold character of the Treatise of 
Fluxions: a synthesis of the old and the new, of geometry and algebra, of 
foundations and of new results, a refurbishment of Newtonian fluxions to deal with 
more modern problems. This contrasts with the explicitly revolutionary philosophy 
of mathematics of Descartes and Leibniz, and thus with the spirit of the 
mathématicien of the eighteenth century on the Continent. 

Of course Scotland was not unmarked by the conflicts of the century. During 
the Jacobite rebellion in 1745, Maclaurin took a major role in fortifying Edinburgh 
against the forces of Bonnie Prince Charlie. When the city was surrendered to the 
rebels, Maclaurin fled to York. Before his return, he became ill, and apparently 
never really recovered. He briefly resumed teaching, but died in 1746 at the 
relatively young age of forty-eight. Nonetheless, the Newtonian tradition in the 
calculus was not a dead end. Maclaurin in his lifetime, and his Treatise of Fluxions 
throughout the century, transmitted an expanded and improved Newtonian calcu- 
lus to Continental analysts. And Maclaurin’s geometric insight helped him advance 
analytic subjects. 

We conclude with the words of an eighteenth-century Continental mathemati- 
cian whose achievements owe much to Maclaurin’s work. [39, pp. 172, 412-425, 
590-597] The quotation [66, p. 350] illustrates Maclaurin’s role in transmitting the 
Newtonian tradition to the Continent, the respect in which he was held, and the 
eighteenth-century social context essential to understanding the fate of his work. 
In 1741, Alexis-Claude Clairaut wrote to Colin Maclaurin, “If Edinburgh is, as you 
say, one of the farthest corners of the world, you are bringing it closer by the 
number of beautiful discoveries you have made.” 


Yet Another Definition of Chaos 


Pat Touhey 


1. INTRODUCTION. The purpose of this article is to introduce yet another 
definition of chaos. The most popular and widely utilized definition is due to 
Devaney [2]. Namely, a self-map on a metric space X is chaotic on X if it has 
three essential ingredients: the periodic points of the map must form a dense 
subset of X, the map must have sensitive dependence on initial conditions, and the 
map must be topologically transitive. 

In a paper appearing in this Monthly [1] Banks et al. showed that the hypothesis 
concerning sensitive dependence is implied by the remaining two conditions. 
Crannell points out in another recent Monthly article [3] that this mathematically 
elegant result yields a somewhat less intuitive definition of chaos. Of the three 
conditions that Devaney states, sensitive dependence is clearly the most easily 
understood. In order to restore this lost sense of intuitiveness, Crannell suggests a 
slightly more natural concept, blending, as an alternative to transitivity. While the 
idea of blending seems to be quite interesting in its own right, it is clearly 
demonstrated in [3] that blending and transitivity are not equivalent. We therefore 
propose a new definition of chaos, equivalent to Devaney’s, and, we hope, just as 
natural. This definition arose several years ago in conversations with John Taylor 
about the article by Banks, et al. We reformulated the two topological conditions 
of transitivity and dense periodic points as a single condition that yields a simple, 
concise definition of chaos: A map $f\colon X \to X$ is chaotic on X if every pair of 
non-empty open subsets of X shares a periodic orbit. We use our definition to give 
a characterization of chaos that restores the lost sense of intuitiveness: A map 
$f\colon X \to X$ is chaotic on X if and only if it mixes together, via periodic cycles, any 
finite number of non-empty open subsets in infinitely many ways. 


2. CHAOS 


Definition 2.1. Given a non-empty set X and a mapping $f\colon X \to X$, the forward orbit 
of x under f is the set $O_f^+(x) = \{x, f(x), f^2(x), \ldots\}$, where $f^n(x) = f(f^{n-1}(x))$ for 
$n \ge 1$ with $f^0(x) = x$. 


Definition 2.2. Given a non-empty set X and a mapping $f\colon X \to X$, x is a periodic 
point of f with primitive period n if $f^n(x) = x$ but $f^k(x) \ne x$ for $k = 1, \ldots, n - 1$. 


We now define the concept of topological transitivity. 


Definition 2.3. Given a metric space X and a continuous mapping $f\colon X \to X$, we say 
that f is transitive if for any two non-empty open subsets U and V of X, there exists 
some $u \in U$ and a non-negative integer k such that $f^k(u) \in V$, that is, every pair of 
non-empty open subsets of X shares a forward orbit. 


Although this is not the usual definition of transitivity, we have no doubt the 
reader can prove the following assertion of equivalence between it and the more 
common notion. 


Proposition 2.4. $f\colon X \to X$ is transitive if and only if for any two non-empty open sets 
$U, V \subset X$ there exists a non-negative integer k such that $f^k(U) \cap V \ne \emptyset$. 


And now we turn to the major result of this article: Yet another definition of 
chaos. 


Definition 2.5. Given a metric space X and a continuous mapping $f\colon X \to X$, we say 
that f is chaotic on X if given U and V, non-empty open subsets of X, there exists a 
periodic point $p \in U$ and a non-negative integer k such that $f^k(p) \in V$, that is, every 
pair of non-empty open subsets of X shares a periodic orbit. 
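
As an illustration of Definition 2.5, the following sketch searches for a periodic orbit shared by two open intervals. It is not taken from the article: it assumes the doubling map x -> 2x mod 1 on [0, 1), a standard example of a chaotic map, and uses the fact that the points j/(2^n - 1) are fixed by the n-th iterate of that map. The function names and the search bound `max_period` are hypothetical choices made for this example.

```python
from fractions import Fraction


def doubling(x):
    """The doubling map x -> 2x mod 1 on [0, 1) (an assumed example map)."""
    return (2 * x) % 1


def shared_periodic_orbit(U, V, max_period=12):
    """Search for a periodic point p in U whose orbit meets V.

    U and V are open intervals given as (left, right) pairs of Fractions.
    Returns (p, k, n) with p in U, doubling^k(p) in V and doubling^n(p) == p,
    or None if nothing is found up to max_period.
    """
    for n in range(1, max_period + 1):
        denom = 2 ** n - 1  # every j/(2^n - 1) satisfies f^n(p) = p
        for j in range(denom):
            p = Fraction(j, denom)
            if not (U[0] < p < U[1]):
                continue
            x = p
            for k in range(n):  # walk once around the orbit of p
                if V[0] < x < V[1]:
                    return p, k, n
                x = doubling(x)
    return None


U = (Fraction(1, 10), Fraction(1, 5))
V = (Fraction(7, 10), Fraction(4, 5))
print(shared_periodic_orbit(U, V))
```

Exact rational arithmetic is used so that periodicity can be tested with equality rather than with a floating-point tolerance.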


It now remains to show that our new definition of chaos is equivalent to the 
definition given by Devaney in [2]. By our remarks in the first section, it is 
sufficient to show that our definition is equivalent to the pair of conditions that f 
be transitive and have dense periodic points. 


Proposition 2.6. $f\colon X \to X$ is chaotic on X if and only if f is transitive and the periodic 
points of f are dense in X. 


Proof: If f is chaotic on X then every pair of non-empty open sets shares a 
periodic orbit. In particular, every non-empty open set must contain a periodic 
point so the periodic points of f are dense in X. The transitivity of f follows from 
the definition of chaos since every pair of non-empty open sets shares a forward 
orbit. 

Now let us assume that f is transitive and has a dense set of periodic points. 
Given any pair of non-empty open sets $U, V \subset X$, transitivity ensures that there 
exists $u \in U$ and a non-negative integer k such that $f^k(u) \in V$. Now define 
$W = f^{-k}(V) \cap U$. Note that W is open and non-empty since it is the intersection 
of two open sets and u is an element of both of them. It is also clear that W has 
the property that $f^k(W) \subset V$. But the periodic points of f are assumed to be 
dense in X, so the non-empty open set W must contain a periodic point p. Thus 
we have shown that there exists a periodic point $p \in W \subset U$ with the property 
that $f^k(p) \in f^k(W) \subset V$. This implies that f is chaotic. ∎ 

It can now be shown that any finite number of non-empty open sets shares a 
periodic orbit whenever f is chaotic. 


Proposition 2.7. Let X be a metric space and let $f\colon X \to X$ be a chaotic mapping on 
X. Then any finite collection of non-empty open subsets of X shares a periodic orbit. 


Proof: Let N be the number of non-empty open subsets in our collection. If 
N = 1, the result follows from the density of periodic points; if N = 2 it follows 
from the definition of a chaotic mapping. We proceed by induction on N. Thus 
assume that the assertion holds for N =n. We will show that it holds for n + 1 
non-empty open subsets. 

There is no loss of generality to assume that the collection consists of n + 1 
disjoint subsets. If the sets are not disjoint then some pair of the non-empty open 
subsets intersects in an open subset. Replacing the pair by their intersection yields 
a collection of n non-empty open subsets that, by our induction hypothesis, shares 
a periodic orbit. Clearly this orbit is shared by the original collection of n + 1 
subsets. 

Now from our disjoint collection choose a subset and call it V. The remaining n 
subsets must share a periodic orbit and, like all periodic orbits, this orbit has a 
primitive period, which we designate by M. From these remaining n non-empty 
open subsets choose any subset and call it $U_0$. Thus we must have $p \in U_0$, where p 
is a periodic point of primitive period $M > n - 1$, with the property that $O_f^+(p)$ 
intersects each of our n subsets. We now label each of the remaining $n - 1$ 
non-empty open subsets in the following manner. As we iterate the point p, it first 
intersects one of the $n - 1$ open subsets for some value, $k_1$, of the iterate, 
$0 < k_1 < M$. Let this subset be designated by $U_1$, i.e., $f^{k_1}(p) \in U_1$. Continuing in 
this fashion we arrive at the next iterate, $f^{k_2}(p)$, $0 < k_1 < k_2 < M$, intersecting 
one of the remaining $n - 2$ open subsets. This subset is designated $U_2$. Eventually 
we will have labeled each of the n open subsets so that $f^{k_i}(p) \in U_i$ for all 
$i = 0, 1, \ldots, n - 1$, where $0 = k_0 < k_1 < \cdots < k_{n-1} < M$. Now we define another 
collection of non-empty open subsets with a particularly nice property. Let $W_0 = 
U_{n-1}$. Clearly $f^{k_{n-1}}(p) \in W_0$. Now consider 


$$W_1 = f^{-[k_{n-1} - k_{n-2}]}(W_0) \cap U_{n-2}.$$


We claim that $W_1$ is a non-empty open subset contained in $U_{n-2}$. It is open 
because it is the intersection of two open subsets, and it is obviously in $U_{n-2}$. That 
it is non-empty follows from the facts that $f^{k_{n-1}}(p) \in W_0$ and $f^{k_{n-2}}(p) \in U_{n-2}$, 
and hence 


$$f^{k_{n-2}}(p) = f^{-[k_{n-1} - k_{n-2}]}\bigl(f^{k_{n-1}}(p)\bigr) \in f^{-[k_{n-1} - k_{n-2}]}(W_0),$$


which implies that $f^{k_{n-2}}(p) \in W_1$. Also note that $W_1$ has the particularly nice 
property that $f^{[k_{n-1} - k_{n-2}]}(W_1) \subset W_0$. Continuing in this fashion, we define 


$$W_i = f^{-[k_{n-i} - k_{n-(i+1)}]}(W_{i-1}) \cap U_{n-(i+1)} \quad \text{for } i = 1, 2, \ldots, n - 1.$$


Each $W_i$ is again non-empty, open, and contained in $U_{n-(i+1)}$. In addition we have 
the particularly nice property that 


$$f^{[k_{n-i} - k_{n-(i+1)}]}(W_i) \subset W_{i-1} \quad \text{for } i = 1, 2, \ldots, n - 1.$$


It is easy to find a periodic orbit that wends itself through our original collection of 
$n + 1$ non-empty open subsets, $\{V, U_0, U_1, \ldots, U_{n-1}\}$. Since V and $W_{n-1}$ are both 
open, they share a periodic orbit. Thus, there exists a periodic point $p' \in V$ and a 
positive integer q such that $f^q(p') \in W_{n-1} \subset U_0$. But then, by our particularly nice 
property, the subsequent iterates of $p'$ must pass through all of the $U_i$’s. 


$$\begin{aligned}
f^{q}(p') &= f^{q+k_0}(p') \in W_{n-1} \subset U_0 \\
f^{q+k_1}(p') &= f^{[k_1 - k_0]}\bigl(f^{[q+k_0]}(p')\bigr) \in f^{[k_1 - k_0]}(W_{n-1}) \subset W_{n-2} \subset U_1 \\
&\;\;\vdots \\
f^{q+k_i}(p') &= f^{[k_i - k_{i-1}]}\bigl(f^{[q+k_{i-1}]}(p')\bigr) \in f^{[k_i - k_{i-1}]}(W_{n-i}) \subset W_{n-(i+1)} \subset U_i \\
&\;\;\vdots \\
f^{q+k_{n-1}}(p') &= f^{[k_{n-1} - k_{n-2}]}\bigl(f^{[q+k_{n-2}]}(p')\bigr) \in f^{[k_{n-1} - k_{n-2}]}(W_1) \subset W_0 = U_{n-1}.
\end{aligned}$$

Thus, the forward orbit of $p'$, $O_f^+(p')$, intersects each of $V, U_0, U_1, \ldots, U_{n-1}$. ∎ 


Corollary 2.8. Let X be a metric space and let $f\colon X \to X$ be a chaotic mapping on X. 
Then any finite collection of non-empty open subsets of X shares infinitely many 
periodic orbits. 


Proof: Assume the existence of a finite collection $\{U_i\}_{i=1,\ldots,n}$ of non-empty open 
subsets that share only a finite number of periodic orbits. Define P to be the set 
consisting of the union of the points in these shared periodic orbits. Since each 
periodic orbit contains a finite number of points, the union of finitely many such 
orbits must be finite. Hence P is a finite set. We now define another collection of 
non-empty open subsets $\{V_i\}_{i=1,\ldots,n}$ by $V_i = U_i \setminus P$. It’s clear that each $V_i \subset U_i$. 
And each $V_i$ is non-empty and open since removing the finite set of points, P, 
from the open set $U_i$ leaves us with a non-empty open set. Thus by Proposition 2.7 
there must be a periodic orbit shared by the collection $\{V_i\}_{i=1,\ldots,n}$. This new orbit 
is clearly not contained in P. On the other hand, this orbit obviously passes 
through the original collection $\{U_i\}_{i=1,\ldots,n}$ of non-empty open subsets since each 
$V_i \subset U_i$. This contradiction proves our result. ∎ 
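
Proposition 2.7 and Corollary 2.8 can be explored numerically. The sketch below is an illustration only, again assuming the doubling map x -> 2x mod 1 on [0, 1) rather than anything from the article. It collects distinct periodic orbits that pass through each of three open intervals; the helper names and the bound `max_period` are hypothetical, and raising the bound should keep producing new shared orbits.

```python
from fractions import Fraction


def orbit(p, n):
    """Return the points p, f(p), ..., f^(n-1)(p) under x -> 2x mod 1."""
    points = []
    x = p
    for _ in range(n):
        points.append(x)
        x = (2 * x) % 1
    return points


def shared_orbits(intervals, max_period=10):
    """Collect distinct periodic orbits that meet every open interval given."""
    found = set()
    for n in range(1, max_period + 1):
        denom = 2 ** n - 1  # every j/(2^n - 1) satisfies f^n(p) = p
        for j in range(denom):
            pts = orbit(Fraction(j, denom), n)
            if all(any(a < x < b for x in pts) for (a, b) in intervals):
                found.add(frozenset(pts))  # an orbit is stored as a set of points
    return found


intervals = [
    (Fraction(0), Fraction(1, 4)),
    (Fraction(2, 5), Fraction(3, 5)),
    (Fraction(3, 4), Fraction(1)),
]
orbits = shared_orbits(intervals)
print(len(orbits), "shared periodic orbits found")
```

Storing each orbit as a frozenset means the same cycle reached from different starting points, or at a multiple of its primitive period, is counted only once.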


Proposition 2.9. Let X be a metric space and $f\colon X \to X$ be a mapping. The following 
are equivalent: 


i) f is chaotic on X. 
ii) f is topologically transitive and has a dense set of periodic points. 
iii) any finite collection of non-empty open sets of X shares a periodic orbit. 
iv) any finite collection of non-empty open sets of X shares infinitely many periodic 
orbits. 


Proof: We have already shown that i) ⇔ ii) ⇒ iii) ⇒ iv). 

If any finite collection of non-empty open sets contained in X shares infinitely 
many periodic orbits it is clear that any pair of open sets shares a periodic orbit. 
Thus iv) ⇒ i). ∎ 


... I drove up the mountain and found a dairy, bought some milk, 
and asked permission to camp under an apple tree. The dairy man 
had a Ph.D. in mathematics, and he must have had some training in 
philosophy. He liked what he was doing and he didn’t want to be 
somewhere else—one of the very few contented people I met in my 
whole journey. 


John Steinbeck, Travels with Charley 
Viking Press, New York, 1962, pp. 25-26 
Contributed by Harold P. Boas, Texas A & M University 


Hereditary Classes of Operators 
and Matrices 


Scott A. McCullough and Leiba Rodman 


1. INTRODUCTION. A recent problem in this Monthly [19] asks whether the 
identity 
$$A^*A = A^2 \tag{1.1}$$


implies that the matrix A is hermitian. Here A* denotes the conjugate transpose 
of the matrix A. The answer is yes (of course, only the case of singular A is 
nontrivial), as can be proved by elementary methods: By Schur’s triangularization 
theorem, which asserts that every square complex matrix is unitarily similar to 
an upper triangular matrix [17, Theorem 2.3.1], we may assume that A itself is 
upper triangular. Inspection of the diagonal entries of $A^*A = A^2$ shows that all 
the diagonal entries of A are real and all the off-diagonal entries are zero. Thus, 
A is Hermitian, since it is unitarily similar to a real diagonal matrix. 

Consider now a “symmetrized” version of (1.1) in which the squares of both A 
and A* appear symmetrically: 


$$A^2 - 2A^*A + A^{*2} = 0. \tag{1.2}$$


Clearly, any matrix A that satisfies (1.1) also satisfies (1.2). Does (1.2) imply that A 
is hermitian? The answer is again yes, and, moreover, it is yes also for a bounded 
linear operator acting on an infinite-dimensional Hilbert space (Theorem 3.1). 
Thus, in the algebra of bounded linear operators on a fixed Hilbert space, the 
three operator identities (1.1), (1.2), and A = A* are equivalent. 

This result is a very special case of the theory of hereditary classes of bounded 
linear operators developed in [2]. The purpose of this paper is to expose the 
general framework of hereditary classes, including the main ideas, results, several 
particular cases, examples, open problems, and applications; this will be done in 
Section 4. In the next two sections we study certain hereditary classes of particular 
interest (n-selfadjoint operators) in detail. Here we mention only that many 
hereditary classes are defined in terms of operator identities of the form 


p(A*, A) = 0, (1.3) 


where $p(\lambda, \mu)$ is a polynomial with complex coefficients in two non-commuting 
variables $\lambda$ and $\mu$ such that in every summand of $p(\lambda, \mu)$ the powers of $\lambda$ (if any) 
appear on the left of the powers of $\mu$ (if any). For example, the class of all 
bounded selfadjoint operators is a hereditary class associated with the polynomial 
$p(\lambda, \mu) = \lambda - \mu$. 

In a broad and admittedly imprecise sense, the theory of hereditary classes can 
be thought of as a “noncommutative” (namely, an operator and its adjoint do not 
necessarily commute) spectral theory, which generalizes and extends the well-known 
“commutative” spectral theory for bounded selfadjoint, or, more generally, normal, 
operators. 

Section 5 is devoted mainly to several classes of operators that are close in a 
certain sense to selfadjoint operators, in spaces with an indefinite inner product. 
The results there are a modest beginning of a theory of hereditary classes in the 
context of indefinite inner product spaces. Since this is essentially uncharted 
territory, there are many open problems, and we state a few of them. 


2. n-SELFADJOINT OPERATORS: THE FINITE DIMENSIONAL CASE. To 
start with, we consider a natural generalization of the formula (1.2). A bounded 
linear operator A on a Hilbert space H is called n-selfadjoint if 


$$\sum_{k=0}^{n} (-1)^k \binom{n}{k} A^{*k} A^{n-k} = 0, \tag{2.1}$$


where n is a fixed positive integer; $A^0$ and $A^{*0}$ are interpreted as the identity 
operator I. Obviously, 1-selfadjoint operators are just selfadjoint. We immediately 
make the following simple observation that will be used later on: 


Proposition 2.1. If A is n-selfadjoint, then for every real number $\mu$ the operator 
$A - \mu I$ is n-selfadjoint as well. 


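Proof: Write 


$$\sum_{k=0}^{n} (-1)^k \binom{n}{k} (A - \mu I)^{*k} (A - \mu I)^{n-k}
= \sum_{k=0}^{n} (-1)^k \binom{n}{k} A^{*k} A^{n-k} + \sum_{p+q<n} f(p, q, \mu)\, A^{*p} A^{q},$$


where $f(p, q, \mu)$ is a certain number. We have 

$$f(p, q, \mu) = \sum_{k} (-1)^k \binom{n}{k} \binom{k}{p} (-\mu)^{k-p} \binom{n-k}{q} (-\mu)^{n-k-q},$$


where the summation is over all integers k such that $p \le k \le n - q$. An easy 
calculation shows that 


$$f(p, q, \mu) = (-\mu)^{n-p-q} \sum_{k=p}^{n-q} (-1)^k \binom{n}{k} \binom{k}{p} \binom{n-k}{q}
= (-\mu)^{n-p-q}\, \frac{n!}{(n-p-q)!\, p!\, q!} \sum_{k=p}^{n-q} (-1)^k \binom{n-p-q}{k-p} = 0.$$

Hence the second sum vanishes, and the first is zero because A is n-selfadjoint. ∎ 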
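
The vanishing of the coefficients f(p, q, μ) rests on a purely combinatorial identity, which can be checked by brute force for small n. The sketch below is only a sanity check under that reading of the proof; the function name `alternating_sum` is hypothetical and introduced for illustration.

```python
from math import comb


def alternating_sum(n, p, q):
    """Sum over p <= k <= n - q of (-1)^k * C(n, k) * C(k, p) * C(n - k, q)."""
    return sum(
        (-1) ** k * comb(n, k) * comb(k, p) * comb(n - k, q)
        for k in range(p, n - q + 1)
    )


# The sum should vanish whenever p + q < n, so that f(p, q, mu) = 0.
for n in range(1, 9):
    for p in range(n):
        for q in range(n - p):  # guarantees p + q < n
            assert alternating_sum(n, p, q) == 0

# When p + q = n only the single term k = p survives, giving (-1)^p * C(n, p).
print(alternating_sum(5, 2, 3))
```

The terms with p + q = n do not vanish; they reproduce the original sum in (2.1), which is why only the lower-order terms need to cancel.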

It turns out that the class of n-selfadjoint operators is hereditary with 

$$p(\lambda, \mu) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} \lambda^{k} \mu^{n-k}.$$

The structure theorem for n-selfadjoint matrices asserts the following: 


Theorem 2.2. An operator A on a finite dimensional Hilbert space is n-selfadjoint if 
and only if 


$$A = T + N, \tag{2.2}$$


where T is selfadjoint, $N^{[(n+1)/2]} = 0$, and T and N commute. ([r] denotes the integer part 
of r). 

An interesting feature of this result is that for any integer k > 0, the 
2k-selfadjoint matrices and the (2k — 1)-selfadjoint matrices are exactly the same 
class. As we will see, the same is true for operators. We verify in Section 3 that 
every operator of the form (2.2), in which T and N have the properties indicated 
in Theorem 2.2, is n-selfadjoint, even when the underlying Hilbert space is infinite 
dimensional. It turns out that the converse is nearly true (at least, for the 
3-selfadjoint operators). This is also discussed in Section 3. 
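
To make the definition (2.1) and the form (2.2) concrete, the following sketch evaluates the left-hand side of (2.1) for small matrices using exact integer arithmetic. It is an illustration only: the helper names are hypothetical, and the test matrix A = 2I + N with N² = 0 is an example chosen here, not one from the text.

```python
from math import comb


def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def adjoint(X):
    """Conjugate transpose; int and complex both provide .conjugate()."""
    size = len(X)
    return [[X[j][i].conjugate() for j in range(size)] for i in range(size)]


def power(X, k):
    """k-th power of X, with X^0 the identity matrix."""
    size = len(X)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(k):
        result = matmul(result, X)
    return result


def defect(A, n):
    """Left-hand side of (2.1): sum of (-1)^k C(n,k) A*^k A^(n-k)."""
    size = len(A)
    total = [[0] * size for _ in range(size)]
    A_star = adjoint(A)
    for k in range(n + 1):
        term = matmul(power(A_star, k), power(A, n - k))
        coeff = (-1) ** k * comb(n, k)
        total = [[total[i][j] + coeff * term[i][j] for j in range(size)]
                 for i in range(size)]
    return total


def is_zero(M):
    return all(entry == 0 for row in M for entry in row)


A = [[2, 5], [0, 2]]          # T = 2I is selfadjoint, N = [[0, 5], [0, 0]], N^2 = 0
print(is_zero(defect(A, 3)))  # A has the form (2.2) with n = 3
print(is_zero(defect(A, 2)))  # A is not selfadjoint
```

With n = 3 the exponent [(n+1)/2] equals 2, so the condition N² = 0 is exactly what Theorem 2.2 asks for; with n = 2 the exponent is 1, which forces N = 0.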

We now prove the difficult direction “only if” of Theorem 2.2, using elementary 
linear algebra. Thus, we assume throughout the rest of this section that $H = \mathbb{C}^p$, 
the vector space of columns with p complex components, and that A is a $p \times p$ 
matrix that satisfies (2.1) (for a fixed n). 


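Lemma 2.3. If x and y are eigenvectors of A corresponding to the eigenvalues $\lambda$ and $\mu$, 
respectively, then 


$$(\lambda - \bar{\mu})\, y^*x = 0. \tag{2.3}$$

Proof: We have $Ax = \lambda x$ and $y^*A^* = \bar{\mu}\, y^*$. Thus 


$$0 = y^*\Bigl(\sum_{k=0}^{n} (-1)^k \binom{n}{k} A^{*k} A^{n-k}\Bigr)x
= \sum_{k=0}^{n} (-1)^k \binom{n}{k} \bar{\mu}^{k} \lambda^{n-k}\, y^*x = (\lambda - \bar{\mu})^n\, y^*x,$$


and (2.3) follows. ∎ 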
The next observation is an immediate consequence of Lemma 2.3. 


Lemma 2.4. All eigenvalues of A are real. 


Indeed, if $\lambda$ were a non-real eigenvalue of A, then taking $\mu = \lambda$, $y = x$ in (2.3) 
would lead to a contradiction. 


Lemma 2.5. If $\lambda$ and $\mu$ are distinct eigenvalues of A, then the root subspaces 
$\mathrm{Ker}(A - \lambda I)^p$ and $\mathrm{Ker}(A - \mu I)^p$ are orthogonal. 


Proof: Let y be an eigenvector of A corresponding to the eigenvalue $\mu$ (necessarily 
real, by Lemma 2.4). Then for every $x \in \mathbb{C}^p$ we have 


$$0 = y^*\Bigl(\sum_{k=0}^{n} (-1)^k \binom{n}{k} A^{*k} A^{n-k}\Bigr)x
= y^*\Bigl[\sum_{k=0}^{n} (-1)^k \binom{n}{k} \mu^{k} A^{n-k}\Bigr]x = (-1)^n\, y^*(\mu I - A)^n x.$$

Thus, $y \perp \mathrm{Im}(\mu I - A)^n$. The nonincreasing sequence of subspaces 


$$\mathrm{Im}(\mu I - A) \supseteq \mathrm{Im}(\mu I - A)^2 \supseteq \cdots$$


must stabilize because of the finite dimensionality of $\mathbb{C}^p$. Let q be a positive 
integer such that 


$$\mathrm{Im}(\mu I - A)^q = \mathrm{Im}(\mu I - A)^r$$


for all $r \ge q$. We certainly have $y \perp \mathrm{Im}(\mu I - A)^q$. If we have proved already 
that $(A - \mu I)^j u \perp \mathrm{Im}(\mu I - A)^q$ for $j = 0, 1, \ldots$, and if $v \in \mathbb{C}^p$ is such that 
$(A - \mu I)v = u$, then for any $y \in \mathrm{Im}(\mu I - A)^q$, Proposition 2.1 ensures that 

$$0 = v^*\Bigl[\sum_{k=0}^{n} (-1)^k \binom{n}{k} (A - \mu I)^{*k} (A - \mu I)^{n-k}\Bigr]y = v^*(A - \mu I)^n y.$$


Thus, $v \perp \mathrm{Im}(\mu I - A)^{n+q} = \mathrm{Im}(\mu I - A)^q$. Starting with the eigenvectors of 
A corresponding to $\mu$, induction and the preceding argument show that 
$\mathrm{Ker}(\mu I - A)^p \perp \mathrm{Im}(\mu I - A)^q$. Since $\mathrm{Ker}(A - \lambda I)^p \subset \mathrm{Im}(\mu I - A)^q$, 


$$\mathrm{Ker}(A - \lambda I)^p \perp \mathrm{Ker}(A - \mu I)^p$$


follows. ∎ 


In view of Lemma 2.5, each root subspace $\mathrm{Ker}(A - \lambda I)^p$ is $A^*$-invariant (as 
well as A-invariant). Thus, we need to prove Theorem 2.2 for the restriction of A 
to each root subspace $\mathrm{Ker}(A - \lambda I)^p$ separately. In other words, we may assume 
that A has only one eigenvalue (possibly of high multiplicity), and in view of 
Proposition 2.1 we may further assume that this eigenvalue is 0, i.e., A is nilpotent. 
Thus, the proof of the “only if” part of Theorem 2.2 reduces to the following 
lemma. 


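Lemma 2.6. If A is nilpotent and satisfies (2.1), then $A^{[(n+1)/2]} = 0$. 


Proof: Arguing by contradiction, it follows easily from the Jordan form of A that 
there exists a vector x in $\mathbb{C}^p$ such that $A^{[(n+1)/2]}x = y \ne 0$ and $A^q x = 0$ for all 
$q > [(n+1)/2]$. Assume first that n is even, say $n = 2m$. Then 


$$0 = x^*\Bigl(\sum_{k=0}^{n} (-1)^k \binom{n}{k} A^{*k} A^{n-k}\Bigr)x = (-1)^m \binom{2m}{m}\, y^*y,$$


a contradiction. If n is odd, say, $n = 2m + 1$, then the preceding formula has the 
analog 


$$0 = (Ax)^*\Bigl(\sum_{k=0}^{n} (-1)^k \binom{n}{k} A^{*k} A^{n-k}\Bigr)x
= \sum_{k=0}^{n} (-1)^k \binom{n}{k}\, x^* A^{*(k+1)} A^{n-k} x = (-1)^m \binom{n}{m}\, y^*y,$$


a contradiction with $y \ne 0$. ∎ 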


For the special case of 3-selfadjoint matrices, Theorem 2.2 can be recast in 
terms of a canonical form under unitary similarity (two matrices X and Y are 
called unitarily similar if X = UYU* for some unitary matrix U). This result 
(Theorem 2.7) is in the spirit of the spectral theory and illustrates how Theorem 
2.2 can be thought of as an extension of the familiar spectral theorem for 
Hermitian matrices. An operator X on a finite dimensional Hilbert space is called 
indecomposable if there is no nontrivial subspace that is invariant for both X 
and $X^*$. 


Theorem 2.7. The $1 \times 1$ matrices $[c]$ and $2 \times 2$ matrices 
$\begin{bmatrix} c & d \\ 0 & c \end{bmatrix}$ (where c is real and 
$d > 0$) are the only (up to unitary similarity) indecomposables in the class of 
3-selfadjoint matrices. Moreover, every 3-selfadjoint matrix A is unitarily similar to an 
orthogonal sum of indecomposables; this orthogonal sum is uniquely determined, up to 
permutations of the orthogonal summands, by A. 


This follows easily from Theorem 2.2 using the following fact: Every $p \times p$ 
matrix N such that $N^2 = 0$ is unitarily similar to a unique (up to a permutation of the 
orthogonal summands) orthogonal sum 


$$\begin{bmatrix} 0 & d_1 \\ 0 & 0 \end{bmatrix} \oplus \cdots \oplus \begin{bmatrix} 0 & d_s \\ 0 & 0 \end{bmatrix} \oplus [0] \oplus \cdots \oplus [0], \tag{2.4}$$

where $d_1, \ldots, d_s$ are positive numbers. The decomposition (2.4) follows from 
representing N as a block matrix 

$$\begin{bmatrix} 0 & 0 & N_1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$


with respect to the orthogonal decomposition 


$$\mathbb{C}^p = \mathrm{Im}\,N \oplus (\mathrm{Ker}\,N \ominus \mathrm{Im}\,N) \oplus (\mathrm{Ker}\,N)^{\perp},$$


and a subsequent reduction of $N_1$ to a positive diagonal matrix by unitary 
equivalence ($N_1 \to UN_1V$, where U and V are unitary matrices; this is the singular 
value decomposition of $N_1$). The uniqueness of (2.4) follows from observing that 
$s = \mathrm{rank}\,N$, and $d_1, \ldots, d_s$ are the nonzero singular values of N. 


3. n-SELFADJOINT OPERATORS: THE INFINITE DIMENSIONAL CASE. The 
development of the theory of n-selfadjoint operators in infinite dimensional 
Hilbert spaces was motivated largely by striking and unexpected intimate connec- 
tions with differential equations, particularly conjugate point theory and disconju- 
gacy. These connections were first observed by Helton [16], and further exploited 
in [3]. Helton [15, 16] introduced the 3-selfadjoint operators (or, more precisely, 
3-selfadjoint operators having a cyclic vector) as models for the “multiplication by 
x” operators on weighted Sobolev spaces supported on a compact interval [a, b] 
(see the example given after Theorem 3.3). 

The theory of n-selfadjoint operators acting on infinite dimensional Hilbert 
space differs significantly from that of n-selfadjoint matrices. In particular, Theo- 
rem 2.2 fails in general in infinite dimensions. In this section we discuss some 
aspects of the available results concerning infinite dimensional n-selfadjoint operators. 
Throughout the rest of the paper, we’ll use $\mathcal{L}(H)$ to denote the bounded 
linear operators on the Hilbert space H and by operator we’ll mean an element of 
$\mathcal{L}(H)$ for some H. 


Theorem 3.1. An operator $A \in \mathcal{L}(H)$ is 2-selfadjoint if and only if A is selfadjoint.


In other words, the result of Theorem 2.2 holds for infinite dimensional
operators if n = 2. The authors of [2, 15, 8] were well aware of Theorem 3.1,
although none of these papers states it explicitly. We will see that Theorem 3.1 is a
byproduct of more general results on operators in spaces with an indefinite scalar 
product (specifically, Theorem 5.1).

Let's agree to call an operator $J \in \mathcal{L}(H)$ a Jordan operator if J = T + N, where
T is selfadjoint, $N^2 = 0$, and TN = NT. If $J \in \mathcal{L}(H)$ is a Jordan operator, then J
is a 3-selfadjoint operator. We say that an operator A is n-Jordan if A has the
form (2.2).


Theorem 3.2. Every n-Jordan operator is n-selfadjoint. 


Proof: Suppose J = T + N with T selfadjoint, $N^{n+1} = 0$, and TN = NT. We shall
show that J is (2n + 1)-selfadjoint. We have

$$J^k = \sum_{j=0}^{n} \binom{k}{j} T^{k-j} N^j. \tag{3.1}$$

Using (3.1) and the properties of T, N, we obtain

$$\sum_{m=0}^{2n+1} (-1)^m \binom{2n+1}{m} J^{*m} J^{2n+1-m} = \sum_{j,l=0}^{n} N^{*l} N^{j}\, T^{2n+1-j-l} \sum_{m=l}^{2n+1-j} (-1)^m \binom{2n+1}{m} \binom{m}{l} \binom{2n+1-m}{j}.$$

The summation on m simplifies to

$$\frac{(2n+1)!}{l!\,j!} \sum_{m=l}^{2n+1-j} (-1)^m \frac{1}{(2n+1-m-j)!\,(m-l)!}.$$

This last summation is of the form $\pm \sum_{q=0}^{p} (-1)^q (q!\,(p-q)!)^{-1}$ with
$p = 2n + 1 - j - l \ge 1$, which is zero. We conclude that J is (2n + 1)-selfadjoint. ∎
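
The argument above can be spot-checked on small matrices. The following is a minimal sketch, assuming real matrices (so that the adjoint is the transpose) and using helper names (`selfadjoint_defect`, `alternating_sum`) that are ours, not the paper's. It evaluates $\sum_m (-1)^m \binom{n}{m} A^{*m} A^{n-m}$ exactly for two Jordan-type examples and confirms that the alternating sum used at the end of the proof vanishes for small p.

```python
from fractions import Fraction
from math import comb, factorial


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, k):
    """k-th power of a square matrix (k >= 0)."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result


def selfadjoint_defect(a, n):
    """Return sum_m (-1)^m C(n, m) A^{*m} A^{n-m} for a real matrix A.

    For real entries the adjoint is the transpose; A is n-selfadjoint
    exactly when the returned matrix is zero.
    """
    size = len(a)
    adj = [list(row) for row in zip(*a)]
    total = [[0] * size for _ in range(size)]
    for m in range(n + 1):
        term = matmul(matpow(adj, m), matpow(a, n - m))
        coeff = (-1) ** m * comb(n, m)
        for i in range(size):
            for j in range(size):
                total[i][j] += coeff * term[i][j]
    return total


def alternating_sum(p):
    """sum_{q=0}^{p} (-1)^q / (q! (p-q)!), computed exactly."""
    return sum(Fraction((-1) ** q, factorial(q) * factorial(p - q))
               for q in range(p + 1))


def is_zero(mat):
    return all(entry == 0 for row in mat for entry in row)


# Illustrative example: T = 2I commutes with N = [[0, 1], [0, 0]], N^2 = 0.
J2 = [[2, 1], [0, 2]]
# Illustrative example: T = I commutes with the 3 x 3 shift N, N^3 = 0.
J3 = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]

print(is_zero(selfadjoint_defect(J2, 3)))
print(is_zero(selfadjoint_defect(J3, 5)))
print([alternating_sum(p) for p in range(1, 6)])
```

The same helper shows that `J2` is not 2-selfadjoint, which is consistent with Theorem 3.1, since `J2` is not selfadjoint.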


If J is a Jordan operator and $M \subset H$ is an invariant subspace for J, then
$A = J|_M$ is also a 3-selfadjoint operator. If H is finite dimensional, then A is again
a Jordan operator, but in general (i.e., if H is infinite dimensional) A need not be 
a Jordan operator. The following theorem clarifies the relation between Jordan 
and 3-selfadjoint operators. An initial version of this theorem is due to Helton [15]. 
Agler [1] established the general result. 


Theorem 3.3 (3-selfadjoint lifting). If A is a 3-selfadjoint operator on a Hilbert space
H, then there exists a Jordan operator J on a Hilbert space K containing H such that H
is invariant for J and $A = J|_H$.


In other words, A lifts to J (see the next section for a more detailed discussion
of this notion). If, however, both A and A* are 3-selfadjoint, then A is Jordan (a
result proved in [15]).

The following example illustrates Theorem 3.3. Let H denote the Hilbert space 
obtained by taking the closure of the continuously differentiable functions on [0, 1] 
in the norm induced by the inner product 


$$\langle p, q \rangle = \int_0^1 (p'\bar{q}' + p\bar{q})\,dt.$$


It is an entertaining exercise to verify directly that the operator A of multiplication
by t on H, Af(t) = tf(t), is bounded and 3-selfadjoint, but this fact will follow
shortly. Let Y denote the operator of multiplication by t on $L^2[0,1]$ and define

$$J = \begin{bmatrix} Y & I \\ 0 & Y \end{bmatrix}$$

acting on $K = L^2[0,1] \oplus L^2[0,1]$, where I denotes the identity operator. Define
V: H → K on continuously differentiable functions (so V is densely defined) by

$$Vp = \begin{bmatrix} p' \\ p \end{bmatrix}.$$

It is easy to see that V is an isometry and thus extends to an isometry, also
denoted by V, on H. Further, VA = JV, so that VH is invariant for J, and,
identifying VH with H, $A = J|_H$. Since J is bounded, so is A.

This last example also shows that not every 3-selfadjoint operator is Jordan. To
see this, begin by noting that if J = T + N is Jordan, then J* = T + N* is also
Jordan and therefore 3-selfadjoint. The norm on H dominates the supremum
norm, and hence for each 0 ≤ x ≤ 1, there is some $k_x \in H$ such that
$\langle p, k_x \rangle = p(x)$ for each polynomial p. (One could also show directly the existence of $k_x$; in
particular, the formulas for $k_0$ and $k_1$ are given below.) A routine argument shows
$A^*k_x = xk_x$. We can thus compute

$$\langle (A^{*3} - 3AA^{*2} + 3A^2A^* - A^3)k_x, k_y \rangle = (x - y)^3 \langle k_x, k_y \rangle.$$

Hence, if A* is 3-selfadjoint, then $\langle k_x, k_y \rangle = 0$, as long as x ≠ y. But notice that
$k_0 = (e^2 - 1)^{-1}(\exp(t) + \exp(2)\exp(-t))$ and $k_1 = (\sinh(1))^{-1}\cosh(t)$; these
formulas for $k_0$ and $k_1$ can be verified using integration by parts to show that
$\langle f, k_0 \rangle = f(0)$ and $\langle f, k_1 \rangle = f(1)$ for any continuously differentiable function f on
[0, 1]. However, $\langle k_0, k_1 \rangle \ne 0$.
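
The reproducing property of the two endpoint kernels can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes real-valued functions, the inner product $\int_0^1 (p'q' + pq)\,dt$, the kernels $k_0(t) = (e^t + e^{2-t})/(e^2 - 1)$ and $k_1(t) = \cosh(t)/\sinh(1)$ (the normalization for which $\langle f, k_1 \rangle = f(1)$ holds), and a composite Simpson rule of our own choosing; the test function `f` is arbitrary.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def inner(p, dp, q, dq):
    """Inner product <p, q> = integral over [0, 1] of p'q' + pq (real case)."""
    return simpson(lambda t: dp(t) * dq(t) + p(t) * q(t), 0.0, 1.0)


def k0(t):
    return (math.exp(t) + math.exp(2 - t)) / (math.exp(2) - 1)


def dk0(t):
    return (math.exp(t) - math.exp(2 - t)) / (math.exp(2) - 1)


def k1(t):
    return math.cosh(t) / math.sinh(1)


def dk1(t):
    return math.sinh(t) / math.sinh(1)


def f(t):
    # Arbitrary smooth test function (an assumption for illustration).
    return t ** 3 + 2


def df(t):
    return 3 * t * t


print(inner(f, df, k0, dk0), f(0))   # the two values should agree
print(inner(f, df, k1, dk1), f(1))   # the two values should agree
print(inner(k0, dk0, k1, dk1))       # nonzero, so k0 and k1 are not orthogonal
```

Since $\langle k_0, k_1 \rangle = k_0(1) = 1/\sinh(1)$, the last value is far from zero, which is exactly what rules out 3-selfadjointness of A*.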

Another interesting property of n-selfadjoint operators is the following: 


Theorem 3.4. For every positive integer k, the classes of 2k-selfadjoint operators and 
(2k — 1)-selfadjoint operators coincide. 


It is possible to give a proof of Theorem 3.4 based upon the techniques of [15]. 
Note that Theorem 3.1 is a particular case of Theorem 3.4. 


4. HEREDITARY CLASSES AND FAMILIES OF OPERATORS. In this section 
we present an informal introduction to a part of the work of J. Agler on families of 
operators and hereditary polynomials, borrowing freely from [2], and focusing on 
the relation between a family and its boundary. We'll illustrate the basics of the 
abstract theory with a family of 3-selfadjoint operators and a local family of 
contraction operators as examples. This local family of contractions is essentially 
finite dimensional. For the reader who prefers to think in terms of matrices, this 
provides an introduction both to families of operators and to a class of operators 
that is preeminent in operator theory. We’ll first discuss hereditary polynomials 
and then introduce the definition of a family. 

An hereditary polynomial is a polynomial in two noncommuting variables x, y of 
the form 


$$p(x, y) = \sum_{i,j} c_{ij}\, y^j x^i.$$

Given an operator A, define

$$p(A) = \sum_{i,j} c_{ij}\, A^{*j} A^i.$$

If q(x, y) is any polynomial then there is a unique hereditary polynomial p(x, y)
such that p(z, w) = q(z, w) for all complex numbers z, w. In this case, define

$$q(A) = p(A).$$
For instance, if $q(x, y) = (x - y)^3$, then

$$q(A) = A^3 - 3A^*A^2 + 3A^{*2}A - A^{*3}.$$

More generally, if $q(x, y) = (x - y)^n$, then

$$q(A) = \sum_{j=0}^{n} (-1)^j \binom{n}{j} A^{*j} A^{n-j}.$$


Thus, A is an n-selfadjoint operator if and only if $(x - y)^n(A) = 0$.
An operator X on a Hilbert space H is positive, denoted X ≥ 0, if ⟨Xh, h⟩ ≥ 0
for every h ∈ H. Thus, A is n-selfadjoint if and only if

$$\pm (x - y)^n(A) \ge 0.$$

Given $A \in \mathcal{L}(H)$, a subspace M of H (all subspaces are closed) is invariant for A
provided AM ⊂ M. The restriction of A to M is denoted $A|_M$. The following
lemma expresses the distinguishing property of hereditary polynomials.


Lemma 4.1. Let A be an operator on a Hilbert space H and suppose M is invariant for
A. If p is an hereditary polynomial and p(A) ≥ 0, then $p(A|_M) \ge 0$.


This lemma is a consequence of the fact that, for h, k ∈ M and p an hereditary
polynomial, $\langle p(A|_M)h, k \rangle = \langle p(A)h, k \rangle$. Here are the details with h = k. Writing
$p(x, y) = \sum c_{ij}\, y^j x^i$ and given h ∈ M, we have

$$\begin{aligned}
\langle p(A|_M)h, h \rangle &= \sum c_{ij} \langle (A|_M)^{*j} (A|_M)^i h, h \rangle \\
&= \sum c_{ij} \langle (A|_M)^i h, (A|_M)^j h \rangle \\
&= \sum c_{ij} \langle A^i h, A^j h \rangle \\
&= \sum c_{ij} \langle A^{*j} A^i h, h \rangle \\
&= \langle p(A)h, h \rangle,
\end{aligned}$$

where in the third equality we use the fact that M is invariant for A.
Given a collection γ of hereditary polynomials, let $\mathcal{F}_\gamma$ denote the collection of
operators $A \in \mathcal{L}(H)$ such that p(A) ≥ 0 for every p ∈ γ. For reasons that will
become clear, we will always assume that there is a positive constant c such that
$c^2 - yx \in \gamma$.


Lemma 4.2. If γ is a collection of hereditary polynomials and there exists a positive
constant c such that $c^2 - yx \in \gamma$, then $\mathcal{F}_\gamma$ satisfies

(i) $\mathcal{F}_\gamma$ is closed with respect to orthogonal direct sums; i.e., if $A_\alpha \in \mathcal{F}_\gamma$, then
$\oplus_\alpha A_\alpha \in \mathcal{F}_\gamma$;

(ii) $\mathcal{F}_\gamma$ is closed with respect to unital *-representations; i.e., if $A \in \mathcal{F}_\gamma \cap \mathcal{L}(H)$
and π is a *-homomorphism mapping I into I from the norm closed
subalgebra generated by A, A* and I into $\mathcal{L}(K)$, then $\pi(A) \in \mathcal{F}_\gamma$;

(iii) $\mathcal{F}_\gamma$ is closed with respect to restriction to invariant subspaces; i.e., if
$A \in \mathcal{F}_\gamma \cap \mathcal{L}(H)$ and M ⊂ H is invariant for A, then $A|_M \in \mathcal{F}_\gamma$;

(iv) $\mathcal{F}_\gamma$ is bounded. In fact, if $A \in \mathcal{F}_\gamma$, then the operator norm of A does not
exceed c.

The condition (iii) is a consequence of Lemma 4.1. The hypothesis $c^2 - yx \in \gamma$
directly implies (iv). The conditions (i) and (ii) are easily checked and, while
important, will not be emphasized here. We say, as shorthand for the conclusion of
Lemma 4.2, that $\mathcal{F}_\gamma$ is a bounded collection of operators that is closed with respect
to orthogonal direct sums, unital *-representations, and restrictions to invariant
subspaces. Any bounded collection of operators $\mathcal{F}$ that is closed with respect to
orthogonal direct sums, unital *-representations, and restriction to invariant
subspaces is called a family. While we haven't defined matrix valued hereditary
polynomials, a theorem of Agler [2] says if $\mathcal{F}$ is a family, then there is a collection
of matrix valued hereditary polynomials Γ such that $\mathcal{F} = \mathcal{F}_\Gamma$.

We now introduce a family of operators that serves as an example throughout 
the remainder of this section. An operator C on a Hilbert space H is a contraction 
if its operator norm is at most one. Equivalently, C is a contraction if 


$$(1 - yx)(C) = I - C^*C \ge 0.$$

The collection of all contractions forms a family of operators that we denote by $\mathcal{C}$.
We can obtain a family of contraction operators whose members behave like
matrices by, in Agler's terminology, localization. To keep things simple, fix distinct
points $0 = \lambda_0, \lambda_1, \dots, \lambda_n$ with $|\lambda_j| < 1$ and let

$$m(z) = \prod_{j=0}^{n} (z - \lambda_j).$$

Let $\mathcal{C}_\lambda = \mathcal{F}_\gamma$, where $\gamma = \{1 - yx,\ m(x),\ -m(x)\}$. Evidently, $C \in \mathcal{C}_\lambda$ if and only if C
is a contraction and m(C) = 0.


Define

$$m_j(z) = \frac{\prod_{i \ne j} (z - \lambda_i)}{\prod_{i \ne j} (\lambda_j - \lambda_i)}$$

for j = 0, ..., n. The following well known lemma expresses the finite dimensional
nature of elements of $\mathcal{C}_\lambda$.


Lemma 4.3. If X is an operator on a Hilbert space H and m(X) = 0, then the operators
$m_j(X)$ satisfy

(i) $m_j(X)^2 = m_j(X)$;
(ii) $m_i(X)\,m_j(X) = 0$ if i ≠ j;
(iii) $\sum_j m_j(X) = I$; and
(iv) $X m_j(X) = \lambda_j m_j(X)$.


Proof: Let $r(z) = \sum_j m_j(z) - 1$. Since r is a polynomial of degree at most n and $r(\lambda_j) = 0$
for each j = 0, 1, ..., n, it follows that r = 0. Thus,

$$I = \sum_j m_j(X).$$

Noting that m divides $m_i m_j$ if i ≠ j establishes (ii). Combining (ii) and (iii) proves
(i). To verify (iv), notice that since m(X) = 0, $(X - \lambda_j I)\,m_j(X) = 0$. ∎
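
A small exact computation illustrates the four properties in Lemma 4.3. This is a sketch under our own assumptions: the matrix `X` and the points `LAMBDAS` below are chosen only for illustration (they satisfy m(X) = 0 with $\lambda_0 = 0$, $\lambda_1 = 1/2$), and the helper names are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def shifted(x, lam):
    """Return X - lam * I."""
    n = len(x)
    return [[x[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]


def lagrange_idempotents(x, lambdas):
    """Evaluate the Lagrange polynomials m_j at the matrix X.

    m_j(z) = prod_{i != j} (z - lambda_i) / prod_{i != j} (lambda_j - lambda_i).
    """
    n = len(x)
    result = []
    for j, lam_j in enumerate(lambdas):
        numerator = identity(n)
        denominator = Fraction(1)
        for i, lam_i in enumerate(lambdas):
            if i != j:
                numerator = matmul(numerator, shifted(x, lam_i))
                denominator *= lam_j - lam_i
        result.append([[entry / denominator for entry in row]
                       for row in numerator])
    return result


# Illustrative data: X has eigenvalues 0 and 1/2 and satisfies X(X - I/2) = 0.
X = [[Fraction(0), Fraction(1)], [Fraction(0), Fraction(1, 2)]]
LAMBDAS = [Fraction(0), Fraction(1, 2)]
P = lagrange_idempotents(X, LAMBDAS)

print(P[0], P[1])
print(matmul(P[0], P[1]))   # property (ii)
print(matmul(X, P[1]))      # property (iv): equals (1/2) * P[1]
```

Exact rational arithmetic is used so that the identities hold with equality rather than up to rounding.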


Hereditary polynomials and families are ideal for studying lifting theorems. 
Suppose $A \in \mathcal{L}(H)$ and $J \in \mathcal{L}(K)$ where K contains H. If H is invariant for J,
and $A = J|_H$ we say A lifts to J. Geometrically, the condition that A lifts to J can
be described as a block operator matrix

$$J = \begin{bmatrix} A & X \\ 0 & * \end{bmatrix} \tag{4.1}$$

with respect to the orthogonal decomposition of K as $H \oplus (K \ominus H)$. Often H is
not explicitly a subspace of K; rather, there exists an isometry V: H → K and
we identify H with VH. In this case, the condition that H is an invariant subspace
for J and $A = J|_H$ becomes VA = JV. For instance, Theorem 3.3 says that a
3-selfadjoint operator lifts to a Jordan operator.

A distinguished member of the class $\mathcal{C}_\lambda$ is (up to unitary equivalence) the
(n + 1) × (n + 1) matrix E acting on the Hilbert space K with basis the eigenvectors
$k_j$, j = 0, 1, ..., n, satisfying $Ek_j = \lambda_j k_j$ and

$$\langle k_j, k_i \rangle = \frac{1}{1 - \lambda_j \bar{\lambda}_i}.$$

Operator theorists know E as the restriction of the adjoint of the unilateral shift to
the span of the kernel functions $\{k(\cdot, \lambda_j)\}$, where $k(z, \lambda) = (1 - z\bar{\lambda})^{-1}$ is the Szegő
kernel.

Let $E^\alpha$ denote the orthogonal direct sum of E with itself α times. The
following is a version of the Sz.-Nagy Dilation Theorem [21, 22]. The proof
indicated below is a version of the de Branges-Rovnyak construction [9].


Theorem 4.4. If $C \in \mathcal{C}_\lambda \cap \mathcal{L}(H)$, then C lifts to $E^\alpha$, where α is (at most) the
dimension of H. If T is in $\mathcal{C} \cap \mathcal{L}(H)$, then T lifts to W*, where W is an isometry.


Proof: The second part of the theorem is well known and there are numerous 
proofs. We refer the reader to [22]. 

It is possible to obtain the first part of the theorem from the second part. 
However, we choose to proceed directly giving a construction with the property 
that if $C \in \mathcal{C}_\lambda \cap \mathcal{L}(H)$ and the dimension of H is finite, then all the Hilbert
spaces involved in the construction have finite dimension. It will be convenient to
represent $E^\alpha$ as a tensor product. Given Hilbert spaces H and H', a Hilbert space
H ⊗ H' is constructed by defining the inner product of elementary tensors,

$$\langle h \otimes h', k \otimes k' \rangle = \langle h, k \rangle \langle h', k' \rangle,$$

and taking the closure, in the induced norm, of all finite linear combinations of
elementary tensors. If $X \in \mathcal{L}(H)$ and $X' \in \mathcal{L}(H')$, we obtain an operator X ⊗ X'
by defining

$$(X \otimes X')(h \otimes h') = Xh \otimes X'h'.$$

In particular, letting I denote the identity on H, we have $E^\alpha = I \otimes E$; i.e., on
elementary tensors,

$$E^\alpha (h \otimes k) = h \otimes Ek.$$

In particular,

$$E^\alpha (h \otimes k_j) = \lambda_j\, h \otimes k_j.$$

Now let D denote the positive semidefinite square root of I − C*C. Define
V: H → H ⊗ K by

$$Vh = \sum_j D m_j(C) h \otimes k_j.$$

A direct computation shows that ⟨Vh, Vh'⟩ = ⟨h, h'⟩ for all h, h' ∈ H. Thus V is
an isometry. Moreover, VC = (I ⊗ E)V. ∎

Given a family $\mathcal{F}$ and some $A \in \mathcal{F}$, we would like to find a $J \in \mathcal{F}$, distinguished
in some way, such that A lifts to J. It is also reasonable to expect that there are
certain elements of $\mathcal{F}$ that we cannot lift in a nontrivial way. An operator $A \in \mathcal{F}$ is
called extremal if whenever A lifts to $J \in \mathcal{F}$ as in (4.1), it follows that X = 0.
Algebraically, A is extremal if whenever there exists a $J \in \mathcal{F}$ and isometry V such
that VA = JV, it follows that VA* = J*V.


Theorem 4.5. $A \in \mathcal{C}_\lambda$ is extremal in $\mathcal{C}_\lambda$ if and only if $A = E^\alpha$ for some α. $T \in \mathcal{C}$ is
extremal if and only if T* is an isometry.


Proof: We give a proof of the second part only, since it is somewhat easier than 
the first part, it illustrates the basic approach, and, while it is known, is not usually 
expressed in this way. 

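Suppose T* is an isometry and

$$J = \begin{bmatrix} T & X \\ 0 & Y \end{bmatrix}$$

is in $\mathcal{C}$. Since J* is then in $\mathcal{C}$, it follows that

$$I - JJ^* = \begin{bmatrix} I - TT^* - XX^* & -XY^* \\ -YX^* & I - YY^* \end{bmatrix} \ge 0.$$

Since I − TT* = 0, it follows that X = 0. We conclude that T is extremal.
Conversely, suppose T* is not an isometry. Then there exists a nonzero vector x
such that $I - TT^* - xx^* \ge 0$. Let

$$J = \begin{bmatrix} T & x \\ 0 & 0 \end{bmatrix}.$$

Compute

$$I - JJ^* = \begin{bmatrix} I - TT^* - xx^* & 0 \\ 0 & 1 \end{bmatrix},$$

from which it follows that $J \in \mathcal{C}$. Hence in this case T is not extremal. ∎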


Agler showed in [2] that if $\mathcal{F}$ is a family and $A \in \mathcal{F}$, then there exists $J \in \mathcal{F}$
extremal such that A lifts to J. A consequence is the following theorem, which is a
cornerstone of Agler's abstract approach to model theory. A subspace M of the
Hilbert space H is reducing for $J \in \mathcal{L}(H)$ if M is invariant for both J and J*. In
this case J decomposes as a direct sum $J_1 \oplus J_2$ with respect to the orthogonal sum
$H = M \oplus (H \ominus M)$.


Theorem 4.6. Given a family $\mathcal{F}$, there exists a subcollection $\partial\mathcal{F}$ of $\mathcal{F}$ such that

(i) $\partial\mathcal{F}$ is closed with respect to orthogonal direct sums;
(ii) $\partial\mathcal{F}$ is closed with respect to unital *-representations;
(iii) $\partial\mathcal{F}$ is closed with respect to restriction to reducing subspaces; i.e., if
$J \in \mathcal{L}(H) \cap \partial\mathcal{F}$ and M is a reducing subspace for J, then $J|_M \in \partial\mathcal{F}$;
(iv) for each $A \in \mathcal{F}$, there exists $J \in \partial\mathcal{F}$ such that A lifts to J;
(v) if $\mathcal{B}$ is any subcollection of $\mathcal{F}$ satisfying (i), (ii), (iii), and (iv) above, then
$\partial\mathcal{F} \subset \mathcal{B}$.


Any collection $\mathcal{B}$ satisfying the conditions (i), (ii), (iii), and (iv) of Theorem 4.6
is called a model for $\mathcal{F}$. The collection $\partial\mathcal{F}$ is the boundary (often called the Agler
boundary) of the family $\mathcal{F}$. The boundary of $\mathcal{F}$ always contains the extremal
elements of $\mathcal{F}$ and in many cases the extremals in $\mathcal{F}$ constitute the boundary of $\mathcal{F}$.
For example, Theorem 4.5 says that $\mathcal{E}_\lambda = \{E^\alpha : \alpha\}$ are the extremals of $\mathcal{C}_\lambda$. On the
other hand, $\mathcal{E}_\lambda$ obviously satisfies (i), Theorem 4.4 implies that it satisfies (iv), and
the proof of Theorem 4.4 shows that it satisfies (ii) and (iii). Thus $\mathcal{E}_\lambda = \partial\mathcal{C}_\lambda$.
Similar reasoning shows that the boundary of $\mathcal{C}$ is the collection of adjoints of
isometries, commonly called coisometries.

Description of the boundary and of the extremal elements for a given family is a 
fundamental problem in the theory and applications of hereditary classes. So far, 
this problem has been satisfactorily solved only for very few (but important) 
families. 

The general theory of hereditary classes, families, their boundaries and extremal 
elements, provides a paradigm for several important special classes in operator 
theory: (a) isometries and unitaries (Wold decomposition); (b) contractions and
coisometries (Theorems 4.4 and 4.5); (c) contractive subnormal and contractive
normal operators (see [11] for a thorough account of the theory and applications of 
subnormal operators); (d) numerical radius contractions (suggested by the recent 
work [13]). 

There are many open problems in the theory of hereditary classes, both 
concrete and of general nature. We would like to emphasize here problems 
concerning special features that matrices (i.e., operators on finite dimensional 
Hilbert spaces) might have. For example: 


Problem 4.7. Is there a simple condition (say, on the hereditary polynomials that
determine the family) that guarantees that the matrices in the boundary are automati- 
cally extremal? More generally, when does the set of extremals coincide with the 
boundary? 


Problem 4.8. Suppose $\mathcal{F}$ is a family with the boundary $\partial\mathcal{F}$. When does every matrix in
$\mathcal{F}$ lift to a matrix in $\partial\mathcal{F}$?


We end this section by returning to the 3-selfadjoint operators. Recall that the 
class of 3-selfadjoint operators is not a family, since it is not bounded. However, by 
imposing some canonical normalization we do obtain a family. Let p(x, y) denote
the hereditary polynomial $-\tfrac{1}{2}(y - x)^2$ and fix an interval [a, b]. Then the set of
3-selfadjoint operators A with spectrum in the given interval [a, b] and such that

$$p(A) \le c^2$$

is a family that we denote by $\mathcal{S} = \mathcal{S}_{a,b,c}$. If J = T + N is a Jordan operator, then
direct computation shows that $J \in \mathcal{S}$ if and only if the spectrum of J is in [a, b]
and $N^*N \le c^2$.


Theorem 4.9. The boundary $\partial\mathcal{S}$ of $\mathcal{S}$ consists of precisely those Jordan operators
$J = T + N \in \mathcal{S}$, where

$$N^*N + NN^* = c^2 I.$$

Moreover, each $J \in \partial\mathcal{S}$ is extremal.
If $J \in \mathcal{S} = \mathcal{S}_{a,b,c}$ is a matrix, then it is possible to give a simple construction
of the lift of J to a matrix $J_0$ in the boundary of $\mathcal{S}$. For simplicity, assume that
c = 1 and that J is completely nonselfadjoint, i.e., there is no nontrivial subspace
that is invariant for both J and J* and on which J is selfadjoint. Then J is (up to
unitary similarity) an orthogonal sum of matrices of the form

$$J = \begin{bmatrix} tI & X \\ 0 & tI \end{bmatrix} \tag{4.2}$$

where ‖X‖ ≤ c = 1, and t ∈ [a, b] (cf. Theorem 2.7). The complete nonselfadjointness
of J guarantees that X is onto. Let $D = \sqrt{I - X^*X}$, the defect of X.
Define

$$V: \mathbb{C}^p \oplus \mathbb{C}^p \to \mathbb{C}^p \oplus \mathbb{C}^p \oplus \mathbb{C}^p \oplus \mathbb{C}^p,$$

where p × p is the size of X in (4.2), by

$$V \begin{bmatrix} g \\ f \end{bmatrix} = \begin{bmatrix} g \\ Xf \\ Df \\ 0 \end{bmatrix}.$$

Let

$$J_0 = \begin{bmatrix} tI & I & 0 & 0 \\ 0 & tI & 0 & 0 \\ 0 & 0 & tI & I \\ 0 & 0 & 0 & tI \end{bmatrix}.$$

One verifies that V is an isometry, and $VJ = J_0V$. Clearly, $J_0$ is in the boundary of
$\mathcal{S}$ as described in Theorem 4.9.
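
The verification can be carried out by hand in the scalar case p = 1. The sketch below is an illustration under explicit assumptions: it takes c = 1, a 2 × 2 block J = [[t, x], [0, t]] with the arbitrary sample values t = 1/3 and x = 3/5 (so the defect is d = 4/5), and one admissible choice of isometry, V(g, f) = (g, xf, df, 0); other arrangements of the rows of V are possible.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of (possibly rectangular) matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


# Sample values (assumptions): t in [a, b], |x| <= 1, and d*d == 1 - x*x.
t = Fraction(1, 3)
x = Fraction(3, 5)
d = Fraction(4, 5)

# The block (4.2) with p = 1, acting on column vectors (g, f).
J = [[t, x],
     [0, t]]

# One admissible isometry: V(g, f) = (g, x f, d f, 0).
V = [[1, 0],
     [0, x],
     [0, d],
     [0, 0]]

# Two copies of [[t, 1], [0, t]]; its nilpotent part is N0.
J0 = [[t, 1, 0, 0],
      [0, t, 0, 0],
      [0, 0, t, 1],
      [0, 0, 0, t]]
N0 = [[0, 1, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 1],
      [0, 0, 0, 0]]

gram = matmul(transpose(V), V)                     # V*V
boundary = [[p + q for p, q in zip(r1, r2)]
            for r1, r2 in zip(matmul(transpose(N0), N0),
                              matmul(N0, transpose(N0)))]

print(gram)                                        # identity: V is an isometry
print(matmul(V, J) == matmul(J0, V))               # intertwining VJ = J0 V
print(boundary)                                    # N0*N0 + N0 N0* = I
```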
As a final remark we note that an operator T satisfying

$$T^{*n}T^n = I + \frac{n}{2}\left(-3 + 4T^*T - T^{*2}T^2\right) + \frac{n^2}{2}\left(1 - 2T^*T + T^{*2}T^2\right)$$

is called a 3-isometry. If T is 3-selfadjoint, then exp(isT) is a 3-isometry for real s.
The notion of 3-isometry can be generalized in a natural way to m-isometry, by
requiring that $T^{*n}T^n$ is a polynomial of degree m − 1 as a function of n. It turns
out that there exist 2-isometries that are not isometries, in contrast to the situation 
for 2-selfadjoint operators. Very important advances in the theory of classes of 
m-isometries, including applications to differential operators, disconjugacy, and 
Brownian motion have been obtained recently in [4]. 


5. SELFADJOINT OPERATORS AND RELATED CLASSES IN KREIN SPACES. 
Let H be a Hilbert space (over the complex numbers), and let J be a selfadjoint
operator on H such that $J^2 = I$. Consider the sesquilinear form [·,·] induced
by J:

$$[x, y] = \langle Jx, y \rangle, \quad x, y \in H,$$

where ⟨·,·⟩ denotes the inner product in H. The corresponding quadratic form
[x, x] is indefinite (unless J = I or J = −I); in other words, there exist x, y ∈ H
for which [x, x] < 0 and [y, y] > 0. The space H, together with the sesquilinear
form [·,·] generated by some J, as just described, is called a Krein space. One can
also define the Krein spaces intrinsically, by starting with a topological vector 
space and a continuous sesquilinear form on it, and by imposing suitable complete- 
ness and nondegeneracy axioms. 

The theory of operators in Krein spaces has been extensively developed over the
past half century, largely motivated by applications in the physical sciences
(description of physical processes governed by boundary value problems, partial
differential equations, etc.). Recently there has been renewed interest in Krein space
theory, mainly due to numerous new applications, especially in modern electrical
engineering; for example, applications of finite dimensional Krein spaces to basic 
optimal worst case control problems in linear systems are described in [7]. Several 
books are dedicated to the theory of operators in Krein spaces and its applications 
[5, 6, 10, 14, 18]; the book [14] deals exclusively with finite-dimensional Krein 
spaces, and as such is more accessible as an introduction than the other four 
books. 

Here we concern ourselves with hereditary classes in Krein spaces. As in the 
Hilbert space case, such classes are defined using polynomial operator identities of 
type (1.3), where, of course, A* stands now for the Krein space adjoint: 


$$[Ax, y] = [x, A^*y] \quad \text{for all } x, y \in H.$$


(Everywhere in this section the adjoint is understood in the Krein space sense.) 
The works in this area we are aware of deal mainly with dilation of isometries (the 
paper [12] and the extensive bibliography there are a good source concerning this 
theory); a possibility of developing some results on 3-selfadjoint operators in Krein 
spaces and their potential connections to the theory of Sturm-Liouville operators 
was indicated already in [15]. There does not exist even a partial theory of 
hereditary classes and families of Krein space operators, analogous in spirit to the 
basic results presented in Section 3. Development of such a theory appears to us a 
worthwhile research project. One should be warned, though: Many facts taken for 
granted in the Hilbert space situation are hopelessly lost in Krein spaces. For 
example, a selfadjoint operator in Krein space need not have real spectrum; a 
selfadjoint operator in finite-dimensional Krein space need not be diagonalizable. 

This section can also be regarded as a rather unusual introduction to some 
aspects of Krein spaces, which is motivated by the notion of 2-selfadjoint operators 
and related hereditary classes. Namely, we consider the class of operators A that
satisfy

$$A^2 - 2A^*A + A^{*2} = 0 \tag{5.1}$$

(naturally, such A will be termed 2-selfadjoint operators), and the classes of
operators A that satisfy the related identities

$$A^*A = A^2; \tag{5.2}$$
$$AA^* = A^2; \tag{5.3}$$
$$A^* = A \quad \text{(selfadjoint operators)}. \tag{5.4}$$


It turns out that the four classes of operators (defined by (5.1), (5.2), (5.3), and 
(5.4), respectively) are all distinct in the Krein space framework, in contrast with 
the Hilbert spaces (see Example 5.2 below). 

We assume from now on that the Krein space defined by J, as in the beginning
of this section, is such that either J = I, or −1 ∈ σ(J) and −1 is an eigenvalue of
J having finite multiplicity. Such Krein spaces are commonly called Pontryagin
spaces. The total multiplicity of −1 as an eigenvalue of J is a finite number, called
the defect of the Pontryagin space. When J = I, we formally define the defect to
be zero.

For an operator A on H, denote by $A_R$ (respectively, $A_I$) the selfadjoint
(respectively, skew-adjoint) part of A. These operators are uniquely defined by the
properties that $A_R$ and $A_I$ are selfadjoint in the Pontryagin space sense, i.e.,

$$[A_R x, y] = [x, A_R y], \quad [A_I x, y] = [x, A_I y]$$

for all x, y ∈ H, and $A = A_R + iA_I$. The following result was proved in [20].

Theorem 5.1. Assume H is a Pontryagin space with defect q, and let A be an operator
on H such that

$$A^2 - 2A^*A + A^{*2} = 0. \tag{5.5}$$

Then $\sigma(A_I) = \{0\}$, $\operatorname{rank} A_I \le 2q$, and

$$\operatorname{rank}(A^*A - AA^*) \le 2q - 1 \quad \text{if } q > 0. \tag{5.6}$$


Theorem 3.1 is a particular case of Theorem 5.1 (where one takes q = 0). 
We illustrate Theorem 5.1, as well as some other aspects of the theory of Krein 
space operators, by the following example. 


Example 5.2. Let $H = \mathbb{C}^3$ (with the standard Hilbert space structure), and let

$$J = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}.$$

Note that H has defect one as a Pontryagin space. Consider

$$A = \begin{bmatrix} 0 & p & r \\ 0 & 0 & q \\ 0 & 0 & 0 \end{bmatrix},$$

where p, q, and r are complex numbers. Using the formula

$$A^* = J(A)'J,$$

where (A)' stands for the conjugate transpose of A, we obtain

$$A^* = \begin{bmatrix} 0 & \bar{q} & \bar{r} \\ 0 & 0 & \bar{p} \\ 0 & 0 & 0 \end{bmatrix}.$$

Thus, A = A* if and only if $p = \bar{q}$ and r is real. We see immediately that there
exist selfadjoint nondiagonalizable matrices. Computations show that (5.2) holds if
and only if $q(p - \bar{q}) = 0$, (5.3) holds if and only if $p(\bar{p} - q) = 0$, and (5.1) holds if
and only if $\operatorname{Re}(pq) = |q|^2$. Thus, in this example, all four classes defined by (5.1), (5.2),
(5.3), and (5.4) are distinct. ∎
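
The computations in Example 5.2 are easy to reproduce mechanically. The following sketch assumes the flip matrix J with ones on the antidiagonal and the Krein space adjoint A* = J A' J (A' the conjugate transpose); the function names and the sample values of p, q, r are ours, chosen so that all arithmetic on Gaussian integers is exact.

```python
def matmul(a, b):
    """Product of two 3 x 3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# The indefinite metric of Example 5.2: ones on the antidiagonal.
FLIP = [[0, 0, 1],
        [0, 1, 0],
        [1, 0, 0]]


def krein_adjoint(a):
    """Krein space adjoint A* = J A' J, with A' the conjugate transpose."""
    conj_t = [[complex(a[j][i]).conjugate() for j in range(3)]
              for i in range(3)]
    return matmul(FLIP, matmul(conj_t, FLIP))


def example_matrix(p, q, r):
    return [[0, p, r],
            [0, 0, q],
            [0, 0, 0]]


def membership(p, q, r):
    """Report which of the identities (5.1)-(5.4) hold for A(p, q, r)."""
    a = example_matrix(p, q, r)
    s = krein_adjoint(a)
    a2, s2 = matmul(a, a), matmul(s, s)
    sa, as_ = matmul(s, a), matmul(a, s)
    two_selfadjoint = all(a2[i][j] - 2 * sa[i][j] + s2[i][j] == 0
                          for i in range(3) for j in range(3))
    return {"(5.1)": two_selfadjoint,
            "(5.2)": sa == a2,
            "(5.3)": as_ == a2,
            "(5.4)": s == a}


for sample in [(1, 1, 0), (1, 0, 0), (0, 1, 0), (2, 1 + 1j, 0)]:
    print(sample, membership(*sample))
```

The four sample matrices fall into four different combinations of the classes, which is one concrete way to see that no two of the classes coincide.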


Several interesting open problems present themselves naturally in the context of 
the material of this section. For example: 


Problem 5.3. Describe the set of operators in a finite dimensional Krein space that
satisfy

$$A^2 - 2A^*A + A^{*2} = 0, \qquad AA^* = A^2.$$

Problem 5.4. Determine the indecomposables (analogous to those in the Hilbert space
situation, see Section 2) in the class of 2-selfadjoint operators in finite dimensional
Krein spaces.

Major Centers of Triangles


Clark Kimberling 


Not long ago, the Monthly carried an article with a most intriguing title: “The 
Rise, Fall, and Possible Transfiguration of Triangle Geometry: A Mini-history” [4]. 
Among the possible transfigurations mentioned was a recognition of triangle 
centers as functions instead of mere points. Here, we wish to pose certain 
problems, with partial solutions, for which one is forced into the functional mode 
—otherwise the problems do not make sense. 

To get started, consider the following: Let x(△ABC) denote a procedure for
constructing a center X of △ABC, such as its incenter I, and let y(△ABC) denote
a procedure for finding another (perhaps the same) center Y of △ABC, such as
its centroid G. For a given choice of X we can form the three triangles
XBC, XCA, XAB, and apply the function y to each of these to find their centers:

$$y(\triangle XBC) = Y_A, \quad y(\triangle AXC) = Y_B, \quad y(\triangle ABX) = Y_C.$$

Let

$$A' = XY_A \cap BC, \quad B' = XY_B \cap CA, \quad C' = XY_C \cap AB.$$


Question A: For a given function x, what functions y yield a triangle A'B'C' that is
perspective to △ABC, i.e., is such that the lines AA', BB', CC' concur in a point Z?


It is easy to check that if x constructs the incenter and y constructs the
centroid, then the resulting △A'B'C' is perspective to △ABC, and Z is the
centroid of △ABC. Similarly, if y constructs the circumcenter, then Z is just
the original center, I, as shown in Figure 1.

One can turn these geometric procedures into analytic ones via trilinear 
coordinates and a general definition of “center.” We shall show that if x con- 
structs the incenter, the analysis results in an unexpectedly simple form for a 
particular class of centers called major centers. 


1. TRILINEARS. The position of any point P in the plane of △ABC is uniquely
determined by any three numbers α, β, γ that are proportional to the directed
distances from P to sidelines BC, CA, AB. For example, the incenter, I, being
equidistant from the sidelines, is given by α = β = γ = 1, and we write I = 1:1:1.
Similarly, 


centroid = csc A: csc B: csc C, 

circumcenter = cos A:cos B: cos C, 

orthocenter = sec A:sec B: sec C, 

nine-point center = cos(B − C): cos(C − A): cos(A − B).

Figure 1. (X = I and Y = circumcenter) ⇒ (△A'B'C' perspective to △ABC).


Three points $P_i = \alpha_i : \beta_i : \gamma_i$ are collinear if and only if

$$\begin{vmatrix} \alpha_1 & \beta_1 & \gamma_1 \\ \alpha_2 & \beta_2 & \gamma_2 \\ \alpha_3 & \beta_3 & \gamma_3 \end{vmatrix} = 0. \tag{1}$$

Replacing $\alpha_1 : \beta_1 : \gamma_1$ by a variable point α : β : γ in (1) gives an equation for the
line $P_2P_3$. The dual to the collinearity expressed by (1) is the concurrence of three
lines $\alpha\alpha_i + \beta\beta_i + \gamma\gamma_i = 0$, i = 1, 2, 3. Extending the duality, the intersection of
the last two of these lines is the point

$$\beta_2\gamma_3 - \gamma_2\beta_3 : \gamma_2\alpha_3 - \alpha_2\gamma_3 : \alpha_2\beta_3 - \beta_2\alpha_3. \tag{2}$$


For more about trilinears, see the recent treatment by Coxeter [2] or Oldknow [9].
Boyer [1] traces the origin of trilinear coordinates back to Möbius.
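
Formulas (1) and (2) are the determinant and cross product of coordinate triples, so they translate directly into code. The helpers below are a minimal sketch with names of our own choosing (`cross`, `det3`, `collinear`); points and lines are both stored as triples, a line αu + βv + γw = 0 being represented by its coefficients (u, v, w).

```python
def cross(u, v):
    """Cross product of two triples.

    Applied to two points it gives the coefficients of the line through
    them; applied to two lines it gives their intersection, as in (2).
    """
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def det3(p1, p2, p3):
    """Determinant of the 3 x 3 matrix with rows p1, p2, p3, as in (1)."""
    return sum(a * b for a, b in zip(p1, cross(p2, p3)))


def collinear(p1, p2, p3):
    """Collinearity test (1) for exact (integer or rational) trilinears."""
    return det3(p1, p2, p3) == 0


vertex_a = (1, 0, 0)
incenter = (1, 1, 1)
sideline_bc = (1, 0, 0)          # the line alpha = 0

bisector = cross(vertex_a, incenter)
foot = cross(bisector, sideline_bc)

print(bisector)   # coefficients of the line through A and the incenter
print(foot)       # where that line meets sideline BC
print(collinear(vertex_a, incenter, foot))
```

Because trilinears are homogeneous, any nonzero multiple of a returned triple represents the same point or line.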

When trilinears for points are written in terms of the vertex angles A, B, C, the
points may be regarded as functions of the variables A, B, C. We take this point of
view and note three particular properties of traditional special points of a triangle:
suppose a point P has trilinears

$$f(A, B, C) : g(A, B, C) : h(A, B, C)$$

satisfying

(i) g(A, B, C) = f(B, C, A) and h(A, B, C) = f(C, A, B);
(ii) f(A, C, B) = f(A, B, C);
(iii) if P is written as u(a, b, c): u(b, c, a): u(c, a, b), where a, b, c are the
sidelengths of △ABC, then u is homogeneous in the variables a, b, c.
(By the law of sines and property (i), such a u must exist.)


Then P is called (in [6] and [8]) a triangle center, or simply a center. It is easy to 
verify that the five special points already mentioned are centers. This is true also 
for points named after Lemoine, Gergonne, Nagel, Spieker, Feuerbach, Fermat, 
Napoleon, Euler, Steiner, Hofstadter, Parry, Yff, and others. We now define $\mathcal{F}$ to
be the set of centers and recast Question A as follows:


Problem A: Given X = incenter, find all Y in $\mathcal{F}$ for which △A'B'C' is perspective
to △ABC.


In order to evaluate Y in the three triangles, we shall form trilinears for $Y_A$
relative to △XBC, for $Y_B$ relative to △AXC, and for $Y_C$ relative to △ABX, and
then transform these to trilinears relative to the reference triangle ABC.


2. COORDINATE TRANSFORMATIONS. Any three noncollinear points

$$P_i = f_i(a, b, c) : g_i(a, b, c) : h_i(a, b, c), \quad i = 1, 2, 3,$$

determine a triangle with vertices $P_1, P_2, P_3$. The triangle can be represented as a
matrix

$$M = \begin{bmatrix} f_1(a,b,c) & g_1(a,b,c) & h_1(a,b,c) \\ f_2(a,b,c) & g_2(a,b,c) & h_2(a,b,c) \\ f_3(a,b,c) & g_3(a,b,c) & h_3(a,b,c) \end{bmatrix}.$$

Let $F_i, G_i, H_i$ denote the functions satisfying

$$M^{-1} = \frac{1}{|M|} \begin{bmatrix} F_1(a,b,c) & G_1(a,b,c) & H_1(a,b,c) \\ F_2(a,b,c) & G_2(a,b,c) & H_2(a,b,c) \\ F_3(a,b,c) & G_3(a,b,c) & H_3(a,b,c) \end{bmatrix}.$$

Let α' : β' : γ' be trilinears relative to M for a point Y. Then the matrix equation

$$(\alpha \ \ \beta \ \ \gamma) = (\alpha' \ \ \beta' \ \ \gamma')\,DM \tag{3}$$

gives trilinears α : β : γ relative to △ABC for the point Y, where

$$D = \begin{bmatrix} \delta_1 D_1 & 0 & 0 \\ 0 & \delta_2 D_2 & 0 \\ 0 & 0 & \delta_3 D_3 \end{bmatrix},$$

$$\begin{aligned}
D_1 &= \sqrt{F_1^2 + F_2^2 + F_3^2 - 2F_2F_3\cos A - 2F_3F_1\cos B - 2F_1F_2\cos C}, \\
D_2 &= \sqrt{G_1^2 + G_2^2 + G_3^2 - 2G_2G_3\cos A - 2G_3G_1\cos B - 2G_1G_2\cos C}, \\
D_3 &= \sqrt{H_1^2 + H_2^2 + H_3^2 - 2H_2H_3\cos A - 2H_3H_1\cos B - 2H_1H_2\cos C},
\end{aligned}$$

and $\delta_i = \pm 1$ for i = 1, 2, 3. For a proof, see [6]. In order to say when $\delta_i$ is −1 and
when +1, we must first define the positive side of a sideline of a triangle as the set
of points in the plane of the triangle that lie on the same side of the sideline as
does the vertex that is not already on the sideline. The negative side is the other
side, and the directed distance from a point to a sideline is positive or negative
according to which side the point lies on. Now, the directed distance from Y to
sideline $P_2P_3$ is $\delta_1 h_1 / D_1$, where $h_1 = \alpha F_1 + \beta F_2 + \gamma F_3$, and $\delta_1$ is determined as
follows: if Y lies on the positive side of $P_2P_3$ and $h_1 < 0$, then $\delta_1 = -1$; if Y lies
on the negative side of $P_2P_3$ and $h_1 > 0$, then $\delta_1 = -1$; otherwise, $\delta_1 = 1$. The
numbers $\delta_2$ and $\delta_3$ are defined analogously.

Returning to Problem A, we eventually want to allow X to be points other than
the incenter, so we write X = x : y : z. The relevant matrices are

$$M = \begin{bmatrix} x & y & z \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad M^{-1} = \frac{1}{x} \begin{bmatrix} 1 & -y & -z \\ 0 & x & 0 \\ 0 & 0 & x \end{bmatrix},$$

so that the transformation (3) can be written out as

$$\alpha : \beta : \gamma = x\delta_1 D_1 \alpha' : y\delta_1 D_1 \alpha' + \delta_2 D_2 \beta' : z\delta_1 D_1 \alpha' + \delta_3 D_3 \gamma'. \tag{4}$$

This equation shows how to write trilinears α : β : γ for $Y_A$ relative to △ABC in
terms of trilinears α' : β' : γ' for $Y_A$ given relative to △XBC. Of course, α' : β' : γ'
must be given in terms of the sidelengths of △XBC, which in turn depend on the
choice of point X. Perhaps the simplest case is X = incenter.


3. MAJOR CENTERS AS SOLUTIONS OF PROBLEM A. For Problem A, we
have $X = 1 : 1 : 1$ and $\delta_1 = \delta_2 = \delta_3 = 1$ in equation (4), and, with sidelengths of
$\triangle XBC$ as labeled in Figure 2, we obtain from equation (4) the transformation

$$\alpha : \beta : \gamma = \alpha' \;:\; \alpha' + 2\beta'\cos(C/2) \;:\; \alpha' + 2\gamma'\cos(B/2).$$

It is now easy to see that lines $XY_A$ and $BC$ meet in the point

$$A' = 0 : \beta'\sec(B/2) : \gamma'\sec(C/2).$$

Figure 2. Incenter, X, inradius, r, and associated angles and lengths.
We wish to do the same thing relative to the other two interior triangles, $AXC$
and $ABX$, obtaining points $B'$ and $C'$, then to find equations for the three lines
$AA'$, $BB'$, $CC'$, and then to find a necessary and sufficient condition for these
three lines to concur. Write $Y_A$ as $\alpha_1' : \beta_1' : \gamma_1'$, and similarly, $Y_B = \alpha_2' : \beta_2' : \gamma_2'$ and
$Y_C = \alpha_3' : \beta_3' : \gamma_3'$. There are simple equations for the three lines

$$AA':\quad (-1/\beta_1')\cos(B/2)\,\beta + (1/\gamma_1')\cos(C/2)\,\gamma = 0, \tag{5}$$
$$BB':\quad (1/\alpha_2')\cos(A/2)\,\alpha + (-1/\gamma_2')\cos(C/2)\,\gamma = 0, \text{ and} \tag{6}$$
$$CC':\quad (1/\alpha_3')\cos(A/2)\,\alpha + (-1/\beta_3')\cos(B/2)\,\beta = 0. \tag{7}$$

Using equation (1), we conclude that these lines concur if and only if

$$\alpha_3'\beta_1'\gamma_2' = \alpha_2'\beta_3'\gamma_1', \tag{8}$$

where

$$\alpha_1' = \alpha'(a, r\csc(C/2), r\csc(B/2)), \quad \beta_1' = \alpha'(r\csc(C/2), r\csc(B/2), a), \quad \gamma_1' = \alpha'(r\csc(B/2), a, r\csc(C/2)),$$
$$\alpha_2' = \alpha'(r\csc(C/2), b, r\csc(A/2)), \quad \beta_2' = \alpha'(b, r\csc(A/2), r\csc(C/2)), \quad \gamma_2' = \alpha'(r\csc(A/2), r\csc(C/2), b),$$
$$\alpha_3' = \alpha'(r\csc(B/2), r\csc(A/2), c), \quad \beta_3' = \alpha'(r\csc(A/2), c, r\csc(B/2)), \quad \gamma_3' = \alpha'(c, r\csc(B/2), r\csc(A/2)),$$

and $r$ denotes the inradius of $\triangle ABC$. These equations are amenable if $Y$ is a
center for which there exist trilinears $\alpha : \beta : \gamma$ such that $\alpha$ is a function of $A$ alone;
that is, $\alpha = f(A)$. We call such a center $Y$ a major center.


Theorem. Every major center $Y$ solves Problem A. If $Y = f(A) : f(B) : f(C)$, then the
point of concurrence is the major center given by

$$f(A/2)\sec(A/2) : f(B/2)\sec(B/2) : f(C/2)\sec(C/2). \tag{9}$$

Proof: Given $Y = f(A) : f(B) : f(C)$, we find $Y_A, Y_B, Y_C$ given by

$$\begin{pmatrix} \alpha_1' & \beta_1' & \gamma_1' \\ \alpha_2' & \beta_2' & \gamma_2' \\ \alpha_3' & \beta_3' & \gamma_3' \end{pmatrix} = \begin{pmatrix} f((\pi - A)/2) & f(B/2) & f(C/2) \\ f(A/2) & f((\pi - B)/2) & f(C/2) \\ f(A/2) & f(B/2) & f((\pi - C)/2) \end{pmatrix}. \tag{10}$$

Then equation (8) holds, and equations (5) to (7) together with (2) imply

$$BB' \cap CC' = \frac{\cos(B/2)\cos(C/2)}{\gamma_2'\beta_3'} : \frac{\cos(C/2)\cos(A/2)}{\gamma_2'\alpha_3'} : \frac{\cos(A/2)\cos(B/2)}{\alpha_2'\beta_3'},$$

so that (10) leads directly to (9). ∎


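For example, if $Y = \sin A : \sin B : \sin C$ (the symmedian point, or Lemoine point),
then

$$\begin{pmatrix} \alpha_1' & \beta_1' & \gamma_1' \\ \alpha_2' & \beta_2' & \gamma_2' \\ \alpha_3' & \beta_3' & \gamma_3' \end{pmatrix} = \begin{pmatrix} \cos(A/2) & \sin(B/2) & \sin(C/2) \\ \sin(A/2) & \cos(B/2) & \sin(C/2) \\ \sin(A/2) & \sin(B/2) & \cos(C/2) \end{pmatrix},$$

and

$$BB' \cap CC' = \frac{\cos(B/2)\cos(C/2)}{\sin(C/2)\sin(B/2)} : \frac{\cos(C/2)\cos(A/2)}{\sin(C/2)\sin(A/2)} : \frac{\cos(A/2)\cos(B/2)}{\sin(A/2)\sin(B/2)},$$

so that the point of concurrence (9) is $\tan(A/2) : \tan(B/2) : \tan(C/2)$.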
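The symmedian example can be checked numerically. The following is a minimal sketch, not the author's method: it assumes the line equations (5) to (7) have coefficient triples $(0, -\cos(B/2)/\beta_1', \cos(C/2)/\gamma_1')$ and so on, takes the off-diagonal entries of (10), and intersects two lines by a cross product of their coefficient triples. The helper names `cross`, `concurrence_check` and `normalized`, and the test triangle with angles 50°, 60°, 70°, are illustrative choices.

```python
import math

def cross(u, v):
    """Cross product of two triples; for two lines in trilinears it gives their common point."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def concurrence_check(f, A, B, C):
    """Return (BB' meet CC', residual of that point in the equation of AA')."""
    cA, cB, cC = math.cos(A / 2), math.cos(B / 2), math.cos(C / 2)
    # Off-diagonal entries of (10); the diagonal entries do not enter (5)-(7).
    b1, g1 = f(B / 2), f(C / 2)      # beta_1', gamma_1'
    a2, g2 = f(A / 2), f(C / 2)      # alpha_2', gamma_2'
    a3, b3 = f(A / 2), f(B / 2)      # alpha_3', beta_3'
    line_aa = (0.0, -cB / b1, cC / g1)   # equation (5)
    line_bb = (cA / a2, 0.0, -cC / g2)   # equation (6)
    line_cc = (cA / a3, -cB / b3, 0.0)   # equation (7)
    point = cross(line_bb, line_cc)
    residual = sum(l * p for l, p in zip(line_aa, point))
    return point, residual

def normalized(t):
    """Scale homogeneous coordinates so that the first one equals 1."""
    return tuple(x / t[0] for x in t)

A, B, C = (math.radians(d) for d in (50, 60, 70))
point, residual = concurrence_check(math.sin, A, B, C)
predicted = tuple(math.tan(t / 2) for t in (A, B, C))  # formula (9) with f = sin
print(normalized(point))
print(normalized(predicted))
print(residual)
```

The residual measures how far the intersection of $BB'$ and $CC'$ is from lying on $AA'$; up to rounding it vanishes, and the normalized intersection agrees with $\tan(A/2) : \tan(B/2) : \tan(C/2)$.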
It can be confirmed by Mathematica using trilinears that certain centers other
than major centers also solve Problem A. Among them are the center of the
Kiepert hyperbola, given by $\alpha = \sin A \sin^2(B - C)$, and the center of the Jerabek
hyperbola, given by $\alpha = \cos A \sin^2(B - C)$. The Kiepert hyperbola was recently
resurrected in [4], and the Jerabek, in [10]. One wonders if there is a nice way to
write the general solution of Problem A.


4. MAJOR CENTERS OBTAINED AS LIMITS. Let $F(A)$ denote the infinite
product

$$\sec(A/2)\sec(A/4)\sec(A/8)\sec(A/16)\sec(A/32)\cdots,$$

and let $Z = F(A) : F(B) : F(C)$. It is a charming little calculus problem to prove
that the product converges for all real numbers $A$ to the limit $F(A) = \sin(2A)/2A$,
so that $Z$ is the center

$$\sin(2A)/A : \sin(2B)/B : \sin(2C)/C,$$

of interest because the vertex angles $A, B, C$ appear in the denominators without
being the arguments of trigonometric functions. These angles also appear “exposed”
in another, quite different manner, in which we do not need to know that
$F(A) = \sin(2A)/2A$. Starting with any major center $Y = f(A) : f(B) : f(C)$, iterate
(9) to obtain the center

$$\lim_{n \to \infty} \bigl(f(A/2^n)F(A) : f(B/2^n)F(B) : f(C/2^n)F(C)\bigr). \tag{11}$$


Suppose now that $Y$ = centroid, so that $f$ = cosecant. Then the limit (11) is the
center

$$(1/A)F(A) : (1/B)F(B) : (1/C)F(C). \tag{12}$$

At this point, we recall a kind of conjugate that pervades triangle geometry.
Suppose $P = x : y : z$ is a point not on a sideline of $\triangle ABC$. The reflections of lines
$AP$, $BP$, $CP$ about the interior bisectors of angles $A$, $B$, $C$ respectively, concur in
the isogonal conjugate of $P$, given by trilinears $1/x : 1/y : 1/z$.

Thus, the center in (12) has isogonal conjugate $A/F(A) : B/F(B) : C/F(C)$. Let
$g(A) = A/F(A)$, and using $g$ instead of $f$, iterate (9) again, obtaining as a limit
the center $1/A : 1/B : 1/C$, whose isogonal conjugate is the bare-angle center
$A : B : C$. Thus, we have a “construction” of an extremely easy-to-represent
“transcendental” center. This mysterious center is mentioned in connection with other
transcendental centers conceived a few years ago by Douglas Hofstadter [7].


5. ANOTHER PROBLEM. Returning to Question A, suppose $X$ need not be the
incenter but is free to range about the interior of $\triangle ABC$. Further, suppose $Y$
represents an arbitrary center, and as before,

$$Y_A = Y(\triangle XBC), \quad Y_B = Y(\triangle AXC), \quad Y_C = Y(\triangle ABX),$$

$$A' = XY_A \cap BC, \quad B' = XY_B \cap CA, \quad C' = XY_C \cap AB.$$

It is easy to verify that certain major centers solve Problem B, namely those given
by trilinears $\sin^p A : \sin^p B : \sin^p C$. As $p$ ranges through the real numbers, these
points form the power curve. See [8], [5], and Figure 3.

Figure 3. Nineteen points on the power curve, including centroid (p = −1), incenter (p = 0), and
symmedian point (p = 1).


Next, suppose in the statement of Problem B that Y is a point (the function 
kind of point) but not necessarily a center. Would concurrence of the lines 
AA’, BB',CC’ for all points X then force Y to be a center? The answer to this 
question remains to be found. 


6. ANOTHER KIND OF CONJUGATE? We return once again to Problem A.
This time, we fix $X$ = orthocenter $= \sec A : \sec B : \sec C$, so that

$$M = \begin{pmatrix} \sec A & \sec B & \sec C \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad M^{-1} = \frac{1}{\sec A}\begin{pmatrix} 1 & -\sec B & -\sec C \\ 0 & \sec A & 0 \\ 0 & 0 & \sec A \end{pmatrix}.$$

The transformation $[\alpha : \beta : \gamma]_{ABC} = [\alpha' : \beta' : \gamma']_{XBC} \cdot DM$ can be written out as in
(4). Here, however, in order to keep $\delta_1 = \delta_2 = \delta_3 = 1$, we shall confine our
attention to acute triangles only!

It is tedious but not difficult to verify that if $Y = f(A) : f(B) : f(C)$ then the
three lines $AA'$, $BB'$, $CC'$ concur if and only if

$$\begin{vmatrix} 0 & f(C_B)[B] & -f(B_C)[C] \\ f(C_A)[A] & 0 & -f(A_C)[C] \\ f(B_A)[A] & -f(A_B)[B] & 0 \end{vmatrix} = 0, \tag{13}$$

where $[A] = \sqrt{\sec^2 B + \sec^2 C + 2\sec B \sec C \cos A}$, and $[B]$ and $[C]$ are obtained
from $[A]$ by the usual cyclic permutations of $A, B, C$. Obviously, (13) holds,
since $C_B = A_B$, $C_A = B_A$, and $A_C = B_C$. See Figure 4. Using (2), the point of
concurrence is found to be

$$BB' \cap AA' = f(A_C)f(A_B)[B][C] : f(A_C)f(B_A)[C][A] : f(A_B)f(C_A)[A][B]$$

$$= \frac{1}{f(B_A)[A]} : \frac{1}{f(A_B)[B]} : \frac{1}{f(B_C)[C]}$$

$$= \frac{\csc 2A}{f(\pi/2 - A)} : \frac{\csc 2B}{f(\pi/2 - B)} : \frac{\csc 2C}{f(\pi/2 - C)}.$$

Figure 4. Orthocenter, H, and associated angles and lengths:
$\angle BHC = \pi - A$, $\angle CHA = \pi - B$, $\angle AHB = \pi - C$,
$C_B = A_B$, $C_A = B_A$, $A_C = B_C$.


We write this point of concurrence as $Y^*$. It is easy to see that for any major
center $Y$, we have $(Y^*)^* = Y$. Perhaps $Y^*$ could be called the orthoconjugate of
$Y$ if someone could find a satisfactory way to extend $(\ )^*$ to obtuse triangles.
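Two facts used in this section lend themselves to a quick numerical check. The sketch below assumes the concurrence point has first trilinear $\csc 2A / f(\pi/2 - A)$, and it also tests the identity $[A] = \sin A/(\cos B \cos C)$, which is why $1/[A]$ is proportional to $\csc 2A$ in an acute triangle. The function names `orthoconjugate` and `bracket` are hypothetical, as is the sample function `f`.

```python
import math

def orthoconjugate(f):
    """Map f to the function A -> csc(2A) / f(pi/2 - A)."""
    return lambda A: 1.0 / (math.sin(2 * A) * f(math.pi / 2 - A))

def bracket(A, B, C):
    """[A] = sqrt(sec^2 B + sec^2 C + 2 sec B sec C cos A)."""
    sec_b, sec_c = 1 / math.cos(B), 1 / math.cos(C)
    return math.sqrt(sec_b ** 2 + sec_c ** 2 + 2 * sec_b * sec_c * math.cos(A))

# An arbitrary positive sample function standing in for a major center.
f = lambda t: 1.0 + t * t
f_star_star = orthoconjugate(orthoconjugate(f))

A, B, C = (math.radians(d) for d in (50, 60, 70))   # an acute triangle
print(f(A), f_star_star(A))
print(bracket(A, B, C), math.sin(A) / (math.cos(B) * math.cos(C)))
```

Applying the map twice returns the original function because $\csc(\pi - 2A) = \csc 2A$, which is the content of $(Y^*)^* = Y$.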


7. CONCLUSION. We wish to emphasize the main point of this paper by return- 
ing to Question A. For any given procedure x (i.e., center X), if you attempt to 
carry out the construction, you are forced to treat Y not as a mere point—that’s 
the traditional understanding of a triangle center—but rather as a function, here 
applied to three different triangles. This distinction between point and function 
really does make a difference, since, as we have seen, Problems A and B have 
solutions that can be understood only as functions. 
On Lambert’s proof of the irrationality of π


M. Laczkovich


The irrationality of π was first proved by J. H. Lambert in 1761 in his paper [3]
(reprinted in [4, pp. 112-159]). Lambert’s argument is the following. First he
proves the formula

$$\tan x = \cfrac{x}{1 - \cfrac{x^2}{3 - \cfrac{x^2}{5 - \cdots}}}. \tag{1}$$

Then Lambert shows, by an argument of infinite descent, that if $x \ne 0$ is rational
then the right hand side of (1) is irrational. Since $\tan(\pi/4) = 1$ is rational, this
implies that π is irrational.

Lambert’s proof is seldom reproduced in books on number theory. The reason is
clear: a rigorous proof of (1) cannot avoid questions of convergence of continued
fractions, and if our aim is just to prove the irrationality of π then this digression
is not worth the effort. (The last monograph that gives Lambert’s argument in
detail seems to be Chrystal’s Algebra [1].) The “usual” proofs avoid continued
fractions and use variants of Hermite’s idea: if π were rational then certain sums
or integrals would be integers, contradicting estimates showing that the actual
values lie in (0, 1); see Niven’s book [5]. In the notes on Chapter 2, Niven also gives
a list of papers following this line. J. Popken published several papers on the
subject that contain variants of Hermite’s argument [6], [7], [8]. The paper [9] is
different; here Popken reproduces Lambert’s computation and infers, in a particularly
simple way, Lambert’s theorem: if $x \ne 0$ is rational then $\tan x$ is irrational.

In this paper we further simplify [9] by replacing its computations with Gauss’
functional equation. This gives a very simple proof of the irrationality of $\tan x$ (and
also of $f(x)$ for a wide class of other functions) whenever $x \ne 0$ is rational. The
irrationality of π follows. We also give a self-contained proof of (1) using the same
device.

Lambert’s original computation leading to (1) was somewhat tedious; he divided
the power series of $\sin x$ by that of $\cos x$ using a version of Euclid’s algorithm, and
determined the quotients and the remainders. This computation was simplified by
Gauss [2], who determined the continued fraction expansions of the hypergeometric
series using their functional equations. If we want to prove only (1), then,
following Gauss’ argument, we may restrict our attention to the one-parameter
family

$$f_k(x) = 1 - \frac{x^2}{k} + \frac{x^4}{k(k+1)\cdot 2!} - \frac{x^6}{k(k+1)(k+2)\cdot 3!} + \cdots.$$

It is easy to see that the series defining $f_k$ converges for every $x$ and for every
$k \ne 0, -1, -2, \ldots$. A simple computation shows that

if $k = 1/2$ then $k(k+1)\cdots(k+n-1)\cdot n! = (2n)!/4^n$,

and

if $k = 3/2$ then $k(k+1)\cdots(k+n-1)\cdot n! = (2n+1)!/4^n$.

Therefore we have

$$f_{1/2}(x) = \cos(2x) \quad\text{and}\quad f_{3/2}(x) = \frac{\sin(2x)}{2x}$$

for every $x$. It is also easy to check, by comparing the coefficients of $x^{2n}$, that

$$\frac{x^2}{k(k+1)}\, f_{k+2}(x) = f_{k+1}(x) - f_k(x) \tag{2}$$

for every $x$ and for every $k \ne 0, -1, -2, \ldots$. In the proof of the following
theorem we combine (2) with the argument of [9].
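The series and the functional equation (2) are easy to test numerically. The following is a minimal sketch that sums the series for $f_k$ term by term (the $n$-th term being $(-1)^n x^{2n}/(k(k+1)\cdots(k+n-1)\,n!)$) with a fixed truncation; the function name `f` mirrors the text, while the truncation length `terms=60` and the sample values of `k` and `x` are arbitrary choices.

```python
import math

def f(k, x, terms=60):
    """Partial sum of f_k(x) = sum_n (-1)^n x^(2n) / (k(k+1)...(k+n-1) * n!)."""
    total, term = 1.0, 1.0
    for n in range(1, terms):
        # Ratio of consecutive terms: -x^2 / ((k + n - 1) * n).
        term *= -x * x / ((k + n - 1) * n)
        total += term
    return total

x = 0.7
print(f(0.5, x), math.cos(2 * x))                # f_{1/2}(x) = cos(2x)
print(f(1.5, x), math.sin(2 * x) / (2 * x))      # f_{3/2}(x) = sin(2x) / (2x)

k, y = 2.3, 1.1
lhs = y * y / (k * (k + 1)) * f(k + 2, y)        # left side of (2)
rhs = f(k + 1, y) - f(k, y)                      # right side of (2)
print(lhs, rhs)
```

The ratio $f_{3/2}(x/2)/f_{1/2}(x/2)$ computed this way equals $(\tan x)/x$, which is the quantity that appears later in Corollary 3.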


Theorem 1. If $x \ne 0$ and $x^2$ is rational, then $f_k(x) \ne 0$ and $f_{k+1}(x)/f_k(x)$ is
irrational for every $k \in \mathbb{Q}$, $k \ne 0, -1, -2, \ldots$.


Proof: First we show that

$$\lim_{k \to \infty} f_k(x) = 1. \tag{3}$$

Indeed, since $x^{2n}/n! \to 0$ as $n \to \infty$, there is some $K > 0$ such that $|x^{2n}/n!| \le K$
for every $n$. Therefore, if $k > 1$, then $|f_k(x) - 1| \le \sum_{n=1}^{\infty} K/k^n = K/(k - 1)$, from
which (3) follows.

Let $x$ be a nonzero real number such that $x^2$ is rational, let $k \in \mathbb{Q}$, $k \ne
0, -1, -2, \ldots$ be fixed, and suppose that $f_k(x) = 0$ or $f_{k+1}(x)/f_k(x)$ is rational.
Then $f_k(x)$ and $f_{k+1}(x)$ are both integer multiples of the same quantity: say
$f_k(x) = ay$ and $f_{k+1}(x) = by$ for integers $a$ and $b$. We allow $a$ or $b$ to be zero. But
$y$ cannot be zero, since it would then follow from (2) that $f_{k+n}(x) = 0$ for every
$n = 1, 2, \ldots$, which would contradict (3).

Let $q$ be a positive integer such that $bq/k$, $kq/x^2$, and $q/x^2$ are all
integers. Now let $G_0 = f_k(x)$ and

$$G_n = \frac{q^n}{k(k+1)\cdots(k+n-1)}\, f_{k+n}(x) \qquad (n = 1, 2, \ldots).$$

Then $G_0 = ay$, $G_1 = (bq/k)y$, and from (2) we can calculate that

$$G_{n+2} = \left(\frac{kq}{x^2} + \frac{q}{x^2}\, n\right) G_{n+1} - \frac{q^2}{x^2}\, G_n \tag{4}$$

for every $n = 0, 1, \ldots$. The coefficients in (4) are integers, so $G_n$ is an integer
multiple of $y$ for every $n$. Since $f_{k+n}(x) \to 1$ by (3) and $q^n/[k(k+1)\cdots(k+n-1)] \to 0$,
we have $G_n \to 0$. But $f_{k+n}(x) \to 1$ also implies that $G_n$ is positive for all
sufficiently large $n$. Positive integer multiples of $y$ cannot converge to zero. The
contradiction means that $f_k(x)$ and $f_{k+1}(x)$ cannot both be integer multiples of the
same quantity. ∎
Corollary 2. $\pi^2$ is irrational.

Proof: $f_{1/2}(\pi/4) = \cos(\pi/2) = 0$. ∎

Corollary 3. If $x \ne 0$ is rational, then $\tan x$ is irrational.


Proof: Since $(x/2)^2$ is nonzero and rational, $f_{3/2}(x/2)/f_{1/2}(x/2) = (\tan x)/x$ is
irrational, and then so is $\tan x$. ∎


Although we eliminated (1) from the proof, for the sake of completeness we give
a simple and self-contained proof of (1) using (2). We prove that (1) holds for every
complex number $x$. The continued fraction

$$\cfrac{b_1}{1 + \cfrac{b_2}{1 + \cdots + \cfrac{b_{n-1}}{1 + b_n}}}$$

will be denoted by $[b_1, \ldots, b_n]$. Since occasionally we may have to divide by zero,
we add to the set $\mathbb{C}$ of complex numbers an infinite element $\infty$, and adopt the
following conventions: (i) $a/0 = \infty$ ($a \in \mathbb{C}$, $a \ne 0$); (ii) $a/\infty = 0$ ($a \in \mathbb{C}$); and (iii)
$a + \infty = a - \infty = \infty$ ($a \in \mathbb{C}$). It is easy to see, using induction on $n$, that

if $|b_i| \le 1/4$ for every $i = 1, \ldots, n - 1$ and if $|b_n| \le 1/2$,
then $|[b_1, \ldots, b_n]| \le 1/2$. (5)

We show next that if $|b_i| \le 1/4$ for every $i = 1, \ldots, n$ and if $|\delta| \le 1/4$, then

$$|[b_1, \ldots, b_{n-1}, b_n + \delta] - [b_1, \ldots, b_n]| \le |\delta|. \tag{6}$$

This is clearly true for $n = 1$. Let $n > 1$, and suppose (6) is true for $n - 1$. Let
$|b_i| \le 1/4$ ($i = 1, \ldots, n$) and $|\delta| \le 1/4$. Denoting $u = [b_2, \ldots, b_n]$ and $v =
[b_2, \ldots, b_{n-1}, b_n + \delta]$, we have $|u|, |v| \le 1/2$ by (5), and $|v - u| \le |\delta|$ by the
induction hypothesis. Then

$$|[b_1, \ldots, b_{n-1}, b_n + \delta] - [b_1, \ldots, b_n]| = \left|\frac{b_1}{1 + v} - \frac{b_1}{1 + u}\right| = \frac{|b_1|\,|u - v|}{|(1 + u)(1 + v)|} \le \frac{(1/4)|\delta|}{(1 - (1/2))(1 - (1/2))} = |\delta|,$$

which completes the proof.
Now let $x \ne 0$ be fixed. Let $k = 1/2$, and put $a_n = f_{n+(3/2)}(x)/f_{n+(1/2)}(x)$
($n = 0, 1, \ldots$). By (2) we have

$$a_n = \cfrac{1}{1 - \cfrac{x^2}{(n + (1/2))(n + (3/2))}\, a_{n+1}}.$$

Since $a_0 = \tan(2x)/(2x)$, this implies

$$\tan(2x)/(2x) = \left[1,\; -\frac{x^2}{(1/2)\cdot(3/2)},\; -\frac{x^2}{(3/2)\cdot(5/2)},\; \ldots,\; -\frac{x^2}{(n - (1/2))(n + (1/2))}\, a_n\right]$$

for every $n$. Replacing $x$ by $x/2$, and multiplying by $x$, we obtain

$$\tan x = \left[x,\; -\frac{x^2}{1\cdot 3},\; -\frac{x^2}{3\cdot 5},\; \ldots,\; -\frac{x^2}{(2n-1)(2n+1)}\, a_n\right]. \tag{7}$$

Let $N > 1$ be such that $x^2/((2n-1)(2n+1)) < 1/4$ and $a_n \in (0, 2)$ for every
$n \ge N$ (recall that $\lim_{n\to\infty} a_n = 1$ by (3)). Let

$$P_n = \left[-\frac{x^2}{(2N+1)(2N+3)},\; -\frac{x^2}{(2N+3)(2N+5)},\; \ldots,\; -\frac{x^2}{(2n-1)(2n+1)}\right]$$

and

$$Q_n = \left[-\frac{x^2}{(2N+1)(2N+3)},\; -\frac{x^2}{(2N+3)(2N+5)},\; \ldots,\; -\frac{x^2}{(2n-1)(2n+1)}\, a_n\right]$$

for every $n > N$. Then (6) ensures that $|P_n - Q_n| \le |a_n - 1|$ for every $n > N$. Let

$$F_N(y) = \left[x,\; -\frac{x^2}{1\cdot 3},\; -\frac{x^2}{3\cdot 5},\; \ldots,\; -\frac{x^2}{(2N-1)(2N+1)},\; y\right].$$

It is easy to check that $F_N$ (as a function of $y$) is a homeomorphism of $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$
onto itself; in fact, $F_N$ is a fractional linear transformation. Since $\tan x = F_N(Q_n)$
by (7), we have $Q_n = F_N^{-1}(\tan x)$ for every $n > N$. Since $|P_n - Q_n| \le |a_n - 1| \to 0$
as $n \to \infty$, this implies $\lim_{n\to\infty} P_n = F_N^{-1}(\tan x)$, and hence, by the continuity of $F_N$,
we obtain $\tan x = \lim_{n\to\infty} F_N(P_n)$. However,

$$F_N(P_n) = \cfrac{x}{1 - \cfrac{x^2}{3 - \cfrac{x^2}{\ddots\; (2n-1) - \cfrac{x^2}{2n+1}}}} \stackrel{\text{def}}{=} R_n.$$

Since the right hand side of (1) is defined as $\lim_{n\to\infty} R_n$, this proves (1). ∎
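The convergents $R_n$ can be evaluated from the innermost denominator outward. The following is a minimal sketch for real $x$ only; the function name `lambert_convergent` is an illustrative choice, and the conventions for $\infty$ are not implemented, so an input that hits a zero denominator (for instance $x = \sqrt{3}$ with $n = 1$ in exact arithmetic) would raise `ZeroDivisionError` rather than return $\infty$.

```python
import math

def lambert_convergent(x, n):
    """R_n = x / (1 - x^2 / (3 - x^2 / (5 - ... - x^2 / (2n + 1))))."""
    value = 2 * n + 1                      # innermost denominator
    for odd in range(2 * n - 1, 0, -2):    # 2n - 1, 2n - 3, ..., 3, 1
        value = odd - x * x / value
    return x / value

for n in (1, 2, 4, 8):
    print(n, lambert_convergent(1.0, n), math.tan(1.0))
```

For moderate $x$ the convergents approach $\tan x$ very quickly, in line with the bound $|P_n - Q_n| \le |a_n - 1|$ used in the proof.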


Note that (1) is valid for every $x \in \mathbb{C}$, even for $x = (\pi/2) + k\pi$, when $\tan x$ is
to be interpreted as $\infty$. The conventions concerning $\infty$ may be needed for other
values of $x$, too, in order to compute some of the “convergents” $R_n$. (Take, for
example, $x = \sqrt{3}$.) However, for every given $x$, $R_n$ can be computed using finite
numbers only, if $n$ is large enough. Indeed, it is easy to see that there is a finite set
$S$ (depending on $x$) such that for every $y \notin S$, the computation of $F_N(y)$ does not
involve $\infty$. Since $\lim_{n\to\infty} a_n = 1$, it follows from (2) that $a_n \ne 1$ if $n > n_0$. From this
one can prove that $P_n \ne Q_n$ for $n > n_0$, and hence every number occurs in the
sequence $P_n$ only a finite number of times. This implies that for $n > n_1$ we have
$P_n \notin S$, and then the computation of $R_n = F_N(P_n)$ needs finite numbers only.
... he talked of a learned monk of many centuries ago, who did hit
upon a way of multiplying numbers. That in itself I might
understand, for it was simple, but the adding of each last two
figures to make the next. To wit, one, two, three, five, eight,
thirteen, one and twenty, and thus forward as you may will. Mr B.
averred that he himself did believe these numbers appeared,
though secretly, in many places in nature, as it were a divine cipher
that all living things must copy, for that the ratio between its
successive numbers was that also of a secret of the Greeks, who did
discover a perfect proportion, I believe he said it to be of one to
one and six tenths. He pointed to all that chanced about us, and
said that these numbers might be read therein; and cited other
examples, that I forget now except that many accorded with the
order of petals and leaves in trees and herbs, I know not what.


John Fowles, A Maggot, New American Library, 1985
Contributed by Irving Adler, North Bennington, VT
NOTES


Edited by Jimmie D. Lawson 


A Simple Congruence modulo p 


Winfried Kohnen 


Congruences for prime numbers $p$ have always been of great interest. Examples
include Fermat’s Little Theorem ($n^p \equiv n \pmod p$) or Wilson’s theorem
($(p - 1)! \equiv -1 \pmod p$). In the following we consider the congruence relation
modulo $p$ extended to the ring of rational numbers with denominators not divisible
by $p$. For such fractions $m/n \equiv r/s \pmod p$ if and only if $ms \equiv nr \pmod p$, and
the residue class of $m/n$ is the residue class of $m$ times the inverse of the residue
class of $n$ in $\mathbb{Z}_p$.
The purpose of this note is to state and prove the following result. 


Theorem. Let $p$ be an odd prime. Then

$$\sum_{k=1}^{p-1} \frac{1}{k \cdot 2^k} \equiv \sum_{k=1}^{(p-1)/2} \frac{(-1)^{k-1}}{k} \pmod p. \tag{1}$$


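Proof: First note the identity

$$\sum_{k=1}^{N} \frac{1}{k}(1 - X)^k = \sum_{k=1}^{N} \frac{(-1)^k}{k}\binom{N}{k}(X^k - 1) \qquad (N \in \mathbb{N},\; X \in \mathbb{R}). \tag{2}$$

Indeed, the derivative of the left-hand side of (2) is

$$-\sum_{k=1}^{N} (1 - X)^{k-1} = -\frac{1 - (1 - X)^N}{X} = \frac{(1 - X)^N - 1}{X},$$

while the derivative of the right-hand side is

$$\sum_{k=1}^{N} (-1)^k \binom{N}{k} X^{k-1} = \frac{(1 - X)^N - 1}{X}.$$

Hence the derivatives of both sides are equal. Also (2) is true for $X = 1$.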
In (2) we set $N = p - 1$ and $X = -1$. From $p - k \equiv -k \pmod p$, we deduce

$$\binom{p-1}{k} = \frac{(p-1)\cdots(p-k)}{k!} \equiv (-1)^k \pmod p$$

and thus equation (2) simplifies to

$$\sum_{k=1}^{p-1} \frac{2^k}{k} \equiv \sum_{k=1}^{p-1} \frac{(-1)^k - 1}{k} \pmod p. \tag{3}$$


In the sum on the left we replace $k$ by $p - k \equiv -k \pmod p$ and use Fermat’s
Little Theorem to obtain

$$\sum_{k=1}^{p-1} \frac{2^k}{k} \equiv -2\sum_{k=1}^{p-1} \frac{1}{k \cdot 2^k} \pmod p.$$

The sum on the right of (3) we rewrite as

$$\sum_{k=1}^{(p-1)/2} \frac{(-1)^k - 1}{k} + \sum_{k=1}^{(p-1)/2} \frac{(-1)^{p-k} - 1}{p - k} \equiv 2\sum_{k=1}^{(p-1)/2} \frac{(-1)^k}{k} \pmod p.$$

This proves (1).
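The congruence can be tested directly for small primes. The sketch below assumes that (1) reads $\sum_{k=1}^{p-1} 1/(k \cdot 2^k) \equiv \sum_{k=1}^{(p-1)/2} (-1)^{k-1}/k \pmod p$ and interprets fractions through modular inverses, as described in the opening paragraph of the note. The function names are hypothetical, and `pow(a, -1, p)` requires Python 3.8 or later.

```python
def inverse(a, p):
    """Inverse of a modulo the prime p (a must not be divisible by p)."""
    return pow(a, -1, p)

def left_side(p):
    """Sum over k = 1..p-1 of 1 / (k * 2^k), reduced modulo p."""
    return sum(inverse(k * pow(2, k, p), p) for k in range(1, p)) % p

def right_side(p):
    """Sum over k = 1..(p-1)/2 of (-1)^(k-1) / k, reduced modulo p."""
    return sum((-1) ** (k - 1) * inverse(k, p)
               for k in range(1, (p - 1) // 2 + 1)) % p

for p in (3, 5, 7, 11, 13, 101):
    print(p, left_side(p), right_side(p))
```

Working with residues throughout keeps every intermediate value below $p^2$, so the check stays cheap even for larger primes.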
In the literature, congruences of a type similar to (1) can be found; however, in
general they are of a much deeper nature. For example, in [1] with the help of
properties of the Pell sequence $((1 + \sqrt{2})^n)_{n \ge 1}$ it is shown that

$$\sum_{k=1}^{(p-1)/2} \frac{1}{k \cdot 2^k} \equiv \sum_{k=1}^{[3p/4]} \frac{(-1)^{k-1}}{k} \pmod p. \tag{4}$$

It seems unlikely that (4) can be proved with the simple approach we have used
here.
A Geometrical Method for Finding
an Explicit Formula for the 
Greatest Common Divisor 


Marcelo Polezzi 


This note presents an explicit formula for the greatest common divisor (g.c.d.) of 
two integers derived using a simple geometrical argument. 

In [1], chapter 3, an expression was deduced from which one can easily obtain a
formula for the g.c.d. as a particular case. However, the derivation of that
expression is tedious and lengthy.
Here is the result to be proved:


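Theorem. Let $m$ and $n$ be positive integers. Then

$$\text{g.c.d.}(m, n) = 2\sum_{i=1}^{m-1} \left\lfloor i\,\frac{n}{m} \right\rfloor + (m + n) - mn,$$

where $\lfloor x \rfloor$ is the greatest integer less than or equal to $x$.


Proof: Consider the triangle $M = \{(x, y) \in \mathbb{R}^2 \mid x \ge 0;\; y \ge 0;\; y \le -\frac{n}{m}x + n\}$.
We have

$$\#(M \cap \mathbb{Z}^2) = \sum_{i=1}^{m-1} \left\lfloor -i\,\frac{n}{m} + n \right\rfloor + (m + n + 1) = \sum_{i=1}^{m-1} \left\lfloor (m - i)\frac{n}{m} \right\rfloor + (m + n + 1) = \sum_{j=1}^{m-1} \left\lfloor j\,\frac{n}{m} \right\rfloor + (m + n + 1). \tag{1}$$

On the other hand, by considering the triangle as half of a rectangle, we obtain

$$\#(M \cap \mathbb{Z}^2) = \frac{(m + 1)(n + 1) + (d + 1)}{2}, \tag{2}$$

where $d = \text{g.c.d.}(m, n)$, since the number of lattice points on the hypotenuse is
equal to $(d + 1)$. In fact, let $y_i = -(n/m)x_i + n$. The set of integers $x_i$ between 0
and $m$ such that $y_i$ is an integer is

$$\left\{0,\; \frac{m}{d},\; 2\frac{m}{d},\; \ldots,\; (d - 1)\frac{m}{d},\; m\right\}.$$

Hence, equating (1) and (2), we have

$$\sum_{i=1}^{m-1} \left\lfloor i\,\frac{n}{m} \right\rfloor + (m + n + 1) = \frac{(m + 1)(n + 1) + (d + 1)}{2}.$$

Therefore, $d = 2\sum_{i=1}^{m-1} \lfloor i\,\frac{n}{m} \rfloor + (m + n) - mn$. ∎


Remark. The formula clearly holds for $n = 0$.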
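Both the formula and the lattice-point count behind it can be checked by brute force. The following is a minimal sketch, assuming the formula reads $d = 2\sum_{i=1}^{m-1} \lfloor in/m \rfloor + (m + n) - mn$; the function names are illustrative, and the triangle condition $y \le -(n/m)x + n$ is rewritten as $my + nx \le mn$ to stay in integer arithmetic.

```python
import math

def gcd_by_floor_sum(m, n):
    """g.c.d.(m, n) via 2 * sum_{i=1}^{m-1} floor(i*n/m) + (m + n) - m*n."""
    return 2 * sum((i * n) // m for i in range(1, m)) + (m + n) - m * n

def lattice_points_in_triangle(m, n):
    """Count integer points with x >= 0, y >= 0 and m*y + n*x <= m*n."""
    return sum(1 for x in range(m + 1) for y in range(n + 1)
               if m * y + n * x <= m * n)

m, n = 4, 6
d = gcd_by_floor_sum(m, n)
print(d, math.gcd(m, n))
# Equation (2) of the proof: the count equals ((m+1)(n+1) + (d+1)) / 2.
print(lattice_points_in_triangle(m, n), ((m + 1) * (n + 1) + (d + 1)) // 2)
```

The floor sum has $m - 1$ terms, so the formula is a closed expression rather than a fast algorithm; Euclid's algorithm remains the practical way to compute a g.c.d.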
THE EVOLUTION OF...


Edited by Abe Shenitzer 
Mathematics, York University, North York, Ontario M3J 1P3, Canada 


On the Historical Development 
of Infinitesimal Mathematics 
translated by Abe Shenitzer with the editorial assistance of Hardy Grant 


Detlef Laugwitz 


PART I. THE ALGORITHMIC THINKING OF LEIBNIZ AND EULER 


1. LEIBNIZ AND L’HOSPITAL. Exactly 300 years ago there appeared in Paris
the first book on differential calculus, Analyse des infiniment petits, by Marquis 
Guillaume F.A. L’Hospital (1661-1704). The book was based on materials supplied 
by Johann Bernoulli (1667-1748). Johann and his older brother Jakob (1654-1705) 
were the first “comrades-in-arms” of Gottfried Wilhelm Leibniz (1646-1716), one
of the two discoverers of the calculus. In 1684 Leibniz published for the first time a
few simple applications of the differential calculus, without making any attempt to
provide clear justifications for them. But we find such attempts in his letters to
judicious contemporaries. For example, on 30 March 1690, Leibniz wrote to John
Wallis (1616-1703): “It is useful to regard quantities as infinitely small, so that,
when their ratio is sought, they are omitted, rather than viewed as 0, when they
turn up next to quantities that are incomparably larger. If we have $x + dx$, then $dx$
is omitted. Similarly, we cannot let $x\,dx$ and $(dx)^2$ stand next to one another. Thus
if we have to differentiate $xy$, we write $(x + dx)(y + dy) - xy = x\,dy + y\,dx + dx\,dy$.
But here $dx\,dy$ should be omitted, for it is incomparably smaller than $x\,dy + y\,dx$.
Hence, in each particular case, the error is smaller than any finite quantity.”
(Leibniz, Math. Schriften IV, 63; our translation. D.L., A.S.)

Leibniz proceeds pragmatically. He states a rule which gives a correct result in
‘every special case.’ At first sight, it all looks like facts based on experience, but
theoretical justifications are in the offing. Leibniz seems to add to the system of
quantities apparently measured in terms of real numbers ideal elements $dx, dy, \ldots$
such that a (positive) $dx$ is smaller than every (positive) real quantity. Then the
algebraic rules of computation are applied in the usual manner. The worked
examples show that at the end of a computation the differentials are replaced by
0s. (In the last step we are, to some extent, anticipating history, for it was soon
noticed that the stipulation pertaining to the omitting of a quantity against an
incomparably greater one can be problematic. A case in point is $(x + dx)^2 - x^2 =
2x \cdot dx + (dx)^2$ for $x = 0$!).

This recipe can be used for algebraic expressions. As early as 1684 Leibniz
obtained the root formula

$$d\sqrt{y} = \sqrt{y + dy} - \sqrt{y} = \frac{dy}{\sqrt{y + dy} + \sqrt{y}},$$

which reduces to $d\sqrt{y} = dy/(2\sqrt{y})$. The latter is justified because $dy$ may be
omitted against $y$.

At the time when Bernoulli and L’Hospital were working on the latter’s book, 
the calculus could already boast of significant achievements in geometry and in 
mechanics, as well as in variational problems. The latter is certainly not beginners’ 
stuff. A method for the solution of problems had been found, a method of analysis 
in the original sense of the word. The subsequent designation of the differential 
and integral calculus as “analysis” is derived from the title of the first relevant 
book. L’Hospital tried at first to present the subject synthetically, i.e., to begin with 
a statement of principles (axioms and postulates) followed by deductions. What 
Leibniz expressed in words was postulated in the book in the form of the equation 
y + dy =y. This produced shock, for in a computation dy was not supposed to be 
initially put equal to 0. 

As a philosopher, Leibniz gave a great deal of thought to equality and equivalence
relations, but there is no trace of this in his calculus. Now, in nonstandard
analysis, we write $a = b$ for equality proper and $\approx$ for “almost-equality,” i.e., we
write $a \approx b$ if $a - b$ is infinitely small. Moreover, we carefully differentiate $\mathbb{R}$, the
system of real numbers, from the larger system that includes, in addition, the
infinitely small numbers. In our set-based symbolism, which is just as ahistorical as
the symbol $\approx$, we denote the latter system by ${}^*\mathbb{R}$ and speak of the hyperreal
numbers. A genuine historian may be horrified, but so long as we are aware that
we are using our richer professional language, and that in some cases this language
does not quite reflect the historical state of affairs, not much harm is done. Every
historian must face up to the fact that the language of the sources is less, or
differently, expressive than the corresponding modern professional language.

Using our language, it is possible to state the recipe of Leibniz and L’Hospital
very clearly: We compute in ${}^*\mathbb{R}$. If at the end of a calculation there appears an
almost-equality $a \approx b$ and $a$ is almost-equal to a real number, then $b$ must also be
almost-equal to this real number. Those who want to be really fancy can speak of
a homomorphism st (= standard part of) from a part of ${}^*\mathbb{R}$ to $\mathbb{R}$. The reader can
easily verify that $\mathrm{st}(a + b) = \mathrm{st}\,a + \mathrm{st}\,b$, $\mathrm{st}(a \cdot b) = (\mathrm{st}\,a)(\mathrm{st}\,b)$, provided that all
standard parts exist. That this is not always the case is shown by the example of
$1/dx$.

There is something else we are about to modernize. From Leibniz’s letter to
Wallis we know Leibniz’s proof of the product formula: He writes an equation in
terms of differentials, and if we go over in it to standard parts, then we obtain
$0 = 0$ and are not much wiser. This is why we prefer to use quotients of
differentials (this happened soon after) and view $x$ and $y$ in the derivation as
functions of $t$. Then

$$dx = x(t + dt) - x(t), \text{ etc.,}$$

and

$$\frac{d(xy)}{dt} = \frac{1}{dt}\bigl[(x + dx)(y + dy) - xy\bigr] = x\frac{dy}{dt} + \frac{dx}{dt}\,y + \frac{dx}{dt}\,dy.$$

The derivative function was accepted only after 1800. We use it immediately: If for
a fixed $t = t_0$ all differential quotients $dx/dt$ have the same standard part
regardless of the choice of the infinitely small $dt$, then this standard part is called
the derivative $x'(t_0)$; the function $x(t)$ is said to be differentiable at $t_0$. After going
over to standard parts, we read off from our formula the following result: If $x$ and
$y$ are differentiable, then so is $x \cdot y$ and we have the product formula $(xy)' = xy' + x'y$.
Unfortunately, since Leibniz’s time everything has become a bit more complicated,
but this is the price we must pay for precision. In general, the differential
quotient is a hyperreal number, while the derivative is always real for real
numbers. We leave it to the reader to prove the quotient formula and the chain
rule. By now we have justified the most important part of the differential calculus.
For the time being, we leave integration aside and turn to Euler and his ‘algebraic
analysis.’


2. LEONHARD EULER AND THE ANALYSIS OF THE INFINITE. Euler was 
born in 1707 in Basel. He learned the newest mathematics from Johann Bernoulli. 
This was as natural for him as learning the multiplication table is for others. At the 
age of 20 he left Basel never to return. He worked in St. Petersburg, from 1741 to 
1766 in Berlin, and until the end of his life in 1783 again in St. Petersburg. He was 
the greatest mathematician of the 18th century in all areas, including applications. 
The recent little book by E. Fellmann, head of the Basel Euler-Archiv, is a must. 
The book that appeared on the 200th anniversary of Euler’s death is also valuable 
(Fellmann 1995, Euler 1983). We are about to present small excerpts from Euler’s 
monumental work, chosen to illustrate the manner of thinking that prevailed in his 
time. 

Leibniz and the Bernoullis used infinitely small numbers in the differential
calculus, but they also made occasional use of infinitely large numbers. It was quite
natural for Euler to work with infinite natural numbers and with infinite series
including summands with infinite index. His paper on harmonic series (De progressionibus
harmonicis observationes, Euler Op. ser. I vol. 14, 73-86), written in 1734,
is highly recommended. It was common knowledge that integration of the hyperbolic
function $y = 1/x$ yields the natural logarithm. If we form upper and lower
sums that extend to integral division points then, like Euler, we obtain

$$\sum_{n=1}^{N} \frac{1}{n} = \log N + C_N,$$

and for infinitely large $N$ we always have $C_N \approx C = 0.577\ldots$, where $C$ is the
so-called Euler constant. This implies that

$$\sum_{n=1}^{2N} \frac{(-1)^{n+1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{1}{2N-1} - \frac{1}{2N}$$

$$= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{2N-1} + \frac{1}{2N} - 2\left(\frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2N}\right)$$

$$= \sum_{n=1}^{2N} \frac{1}{n} - \sum_{n=1}^{N} \frac{1}{n} = \log 2N + C_{2N} - \log N - C_N$$

$$\approx \log 2N - \log N = \log\frac{2N}{N} = \log 2.$$

In much the same way Euler evaluated certain previously unknown sums. We
suggest that the reader should try to obtain Euler’s series for $\log 3 \approx \log 3N -
\log N$.
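A large finite $N$ can stand in for Euler's infinitely large $N$ to see the size of the neglected terms. The sketch below is only a finite numerical illustration, not Euler's reasoning: it assumes $C_N$ is defined by $\sum_{n=1}^{N} 1/n = \log N + C_N$, and the names `harmonic` and `euler_c` as well as the choice $N = 10^5$ are arbitrary.

```python
import math

def harmonic(N):
    """Partial sum 1 + 1/2 + ... + 1/N."""
    return sum(1.0 / n for n in range(1, N + 1))

def euler_c(N):
    """C_N defined by harmonic(N) = log N + C_N."""
    return harmonic(N) - math.log(N)

N = 10 ** 5
alternating = sum((-1) ** (n + 1) / n for n in range(1, 2 * N + 1))

print(euler_c(N))                                   # close to 0.577...
print(alternating, harmonic(2 * N) - harmonic(N))   # the same sum, rearranged
print(harmonic(2 * N) - harmonic(N), math.log(2))
print(harmonic(3 * N) - harmonic(N), math.log(3))
```

The differences $C_{2N} - C_N$ and $C_{3N} - C_N$ are of order $1/N$, which is the finite counterpart of the almost-equality used in the text.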
It is not difficult to verify ‘rigorously’ these sums of series by considering finite
partial sums and passing to the limit. But for Euler there were no limit considera- 
tions, and what we want to know is the kind of reasoning employed by him and by 
his contemporaries. This being so, it would be counterproductive to handle these 
problems using perfected later means. 

According to a view that stubbornly persists in the secondary literature, Euler
gave no thought to convergence. And yet his early work devoted to harmonic series
contains a convergence criterion; more precisely, a condition for a series to have a
finite sum. When a series has a finite sum, what is added after an infinite term is
actually infinitely small. Conversely, if what is obtained from the continuation
beyond the most infinite term were finite, then the sum of the series would
necessarily be infinite. (The last condition holds only for series with nonnegative
terms.) Stated as a formula, this criterion asserts that $\sum a_n$ is finite (the series
converges) if and only if for $N$ infinitely large we always have $\sum_{n > N} a_n \approx 0$.

Euler was never particularly interested in general conceptual arguments but, as
we see here, they were not foreign to him. He was far more interested in
algorithmic applications. In the case of the general harmonic series $\sum n^{-p}$, for
$0 < p < 1$, he finds that

$$\sum_{n=N+1}^{2N} \frac{1}{n^p} > N \cdot \frac{1}{(2N)^p} = \frac{N^{1-p}}{2^p}.$$

Since this is infinitely large, the series diverges. For $p > 1$ Euler concludes that the
series converges. All this had been known for a long time on the basis of, say, an
appropriate integral test. If Euler re-proves these facts, he does so, presumably, to
make use of his convergence criterion which, unlike the integral test, can also be
applied to alternating series. For further details see my book of 1986.

Euler’s original papers were not easily accessible. They appeared in obscure 
publications, mostly in Latin, and on rare occasions in French. Hence what 
influenced larger circles of readers was his textbooks, which soon appeared in 
German translations provided with (consistently bad) commentaries. It seems that 
already at that time not all the interested readers had a sufficient knowledge of 
Latin! For a long time the image of Euler as a mathematician was a reflection of 
these elementary books rather than of his deeper original works. We must keep 
this in mind, now that we are about to turn to his Introductio in analysin infinitorum 
(published in 1748). Today, H. Maser’s modern translation is readily available as a 
reprint and includes a usable commentary. 

The literal translation of the Latin title is, of course, An Introduction into the 
Analysis of the Infinite, but a more fitting title would be An Introduction to the 
Solution of Problems by Means of Infinite [numbers and processes]. Euler algebraized 
infinite series, products, and continued fractions. He tested the possibilities 
resulting from the formal extension of algebraic calculations with finite 
formulas to calculations with infinite ones. In particular, he treated power series as 
polynomials of infinite degree. His approach to the exponential function was 
typical. He defined (!) it as 


$$e^x = \left(1 + \frac{x}{N}\right)^N$$


with N an infinite natural number. He made use of the binomial formula and 
obtained 


$$e^x = 1 + N\frac{x}{N} + \frac{N(N-1)}{2!}\frac{x^2}{N^2} + \frac{N(N-1)(N-2)}{3!}\frac{x^3}{N^3} + \cdots$$
$$= 1 + x + \left(1 - \frac{1}{N}\right)\frac{x^2}{2!} + \left(1 - \frac{1}{N}\right)\left(1 - \frac{2}{N}\right)\frac{x^3}{3!} + \cdots.$$
At this point things get precarious. Presumably, since $1/N = 2/N = \cdots = 0$, we 
obtain the power series expansion 
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots.$$


Euler was well aware, and said it elsewhere, that one couldn’t simply claim that the 
sum of infinitely many small errors is 0. After all, an integral is the sum of 
infinitely many infinitely small rectangular areas. Euler could easily prove the 
convergence of this series for real and even for complex values by means of his 
convergence criterion (which he doesn’t mention in the book). All that is needed is 
to majorize the remainder of the series at infinity by means of a geometric 
progression. Its partial sums of finite length are ultimately arbitrarily close to its 
sum. Since the partial sums of the same length in the binomial expansion differ 
from them by arbitrarily little, the asserted convergence follows. In any case, one 
should write $e^x \approx (1 + x/N)^N$. Incidentally, the argument shows that there is no 
need to choose a special infinitely large natural number N. 
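
As a rough numerical counterpart, one may compare the binomial power with partial sums of the power series for a large finite N. This is a sketch under that assumption; `binomial_power` and `exp_partial_sum` are hypothetical helper names, and a finite N can only approximate what Euler did with an infinite one.

```python
import math

def binomial_power(x, N):
    """(1 + x/N)**N for a large finite N; x may be real or complex."""
    return (1 + x / N) ** N

def exp_partial_sum(x, terms):
    """Partial sum 1 + x + x**2/2! + ... with the given number of terms."""
    return sum(x**k / math.factorial(k) for k in range(terms))

if __name__ == "__main__":
    for N in (10, 1000, 10**6):
        print(N, binomial_power(1.0, N), exp_partial_sum(1.0, 20))
```

The same two functions accept complex arguments, which matches the remark that the convergence holds for complex values as well.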

If $y = e^x$, then $x = \log y$. But $y = (1 + x/N)^N$ implies that $x = N(y^{1/N} - 1)$. 
Since, in general, an N-th root has N complex values, one would expect the 
logarithm to be infinitely many-valued. This is how, in the algebraic style of his 
Introductio, Euler settled in 1749 the dispute that animated his two great predeces- 
sors in the years 1712-1713. Their exchange of letters was published in 1745. The 
title of Euler’s essay is De la controverse entre MM. Leibniz et Bernoulli sur les 
logarithmes des nombres negatifs et imaginaires. (Euler Op. ser. I, vol. 17.) 

The essay is an expository masterpiece. It can be read by university and even 
high school students. It was written in the language of the Berlin court and seems 
to have been intended for a wide readership. This is indicated by the thorough 
discussion of details. 

Euler comments on the arguments of the two opponents. Bernoulli thought that 
$\log(-1) = 0$. This can be justified on the basis of the functional equation: 
$0 = \log 1 = \log(-1)^2 = 2\log(-1)$. Leibniz argued that since the exponential function 
is positive for all real inputs, the logarithm of $-1$ cannot be real. Both 
adduced further justifications for their respective positions. The issue remained 
unresolved until Euler produced his surprising solution and actually computed 
explicitly the infinitely many values of the logarithm. For example, take y = 1. By 
the de Moivre formula we have 


$$y^{1/N} = \cos\frac{2\pi g}{N} + i\sin\frac{2\pi g}{N}$$


for every integer $g$. For a finite $g$ we have $N(\cos(2\pi g/N) - 1) = 0$ and 
$N\sin(2\pi g/N) = 2\pi g$, so that $\log 1 = 2\pi g i$ for every integer $g$. More generally, 
for every complex $z \ne 0$ Euler obtained the values $\log z = \log r + i(\alpha + 2\pi g)$, 
where $z = r(\cos\alpha + i\sin\alpha)$. 
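
The many-valuedness can be made tangible with a finite N. The sketch below assumes a large finite N in place of an infinite one and picks the N-th root of y by an index g; the name `log_branch` is illustrative only.

```python
import cmath
import math

def log_branch(y, g: int, N: int = 10**6) -> complex:
    """N * (w - 1), where w is the N-th root of y selected by the integer g."""
    r, alpha = cmath.polar(y)  # y = r * (cos(alpha) + i*sin(alpha))
    root = cmath.rect(r ** (1.0 / N), (alpha + 2 * math.pi * g) / N)
    return N * (root - 1)

if __name__ == "__main__":
    for g in (-1, 0, 1, 2):
        print(g, log_branch(1, g))  # close to 2*pi*g*i
    print(log_branch(-1, 0))        # close to i*pi
```

Each choice of g gives a different approximate value, in line with log z = log r + i(α + 2πg).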
The knowledgeable reader will notice that Euler could have obtained the same 
result more simply by using his famous formula $e^{i\alpha} = \cos\alpha + i\sin\alpha$ and applying 
the functional equation of the logarithm to $z = r e^{i\alpha} = r e^{i(\alpha + 2\pi g)}$. He had 
discovered this formula a few years earlier, but it was not yet universally accepted. 
And not even his algebraic solution of the controversy convinced all his contempo- 
raries; d’Alembert defended Bernoulli’s position as late as 1761. 

What made this question so important? Why did the greatest minds persistently 
return to it? A key problem at the beginning of the century was indefinite 
integration. In the case of differentiation one had an algorithm which when 
applied to, say, rational functions again yielded rational functions. But the inverse 
operation, integration, was not so simple. In 1702 Leibniz and Johann Bernoulli 
independently investigated the integration of rational functions through decompo- 
sition into partial fractions. For example, 


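$$\frac{1}{x^2 - a^2} = \frac{1}{2a}\left(\frac{1}{x - a} - \frac{1}{x + a}\right)$$
led to 
$$\int \frac{dx}{x^2 - a^2} = \frac{1}{2a}\log\frac{x - a}{x + a}.$$
It made sense to try to use this for $a = i = \sqrt{-1}$, in spite of the fact that it was 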
known that the arctan entered at this point. But this too could be reduced to 
logarithms of complex numbers. Hence the great interest in logarithms. 


One would have come closer to the solution if one had used suitable limits of 
integration in 


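$$\int \frac{dx}{x^2 + 1} = \frac{1}{2i}\log\frac{x - i}{x + i}$$


and introduced known values of the arctan function: 


$$\pi = \int_{-\infty}^{+\infty} \frac{dx}{1 + x^2} = \frac{1}{2i}\left[\log 1 - \log 1\right],$$


$$\frac{\pi}{2} = \int_{0}^{+\infty} \frac{dx}{1 + x^2} = \frac{1}{2i}\left[\log 1 - \log(-1)\right],$$


$$\frac{\pi}{2} = \int_{-\infty}^{0} \frac{dx}{1 + x^2} = \frac{1}{2i}\left[\log(-1) - \log 1\right].$$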

Already the first of these relations shows that 0 is not the only value that can be 
assigned to $\log 1$, and the two other equations suggest $\log(-1) = \pm i\pi$. Euler’s 
algebraic solution could not answer the question of which value to choose in a 
particular case. For this one had to wait another century, when Riemann intro- 
duced his surfaces which made the handling of many-valued functions transparent. 
It is worth noting that it was the problem of integration of algebraic (and in 
particular rational) functions that gave rise to Riemann surfaces. 

When it comes to investigations in the Introductio connected with other elementary 
functions, we mention the power-series expansion of the logarithm function, 
related to the solution of $y = (1 + x/N)^N$ for $x$. Putting $y = 1 + h$, $x = \log(1 + h)$ 
we obtain 


$$\log(1 + h) = N\left((1 + h)^{1/N} - 1\right).$$
After Newton, it was legitimate to use the binomial series for arbitrary real 
exponents. Hence, for $|h| < 1$, the series: 


$$\log(1 + h) = N\cdot\frac{1}{N}h + N\cdot\frac{1}{2!}\frac{1}{N}\left(\frac{1}{N} - 1\right)h^2 + N\cdot\frac{1}{3!}\frac{1}{N}\left(\frac{1}{N} - 1\right)\left(\frac{1}{N} - 2\right)h^3 + \cdots$$
$$= h - \frac{h^2}{2} + \frac{h^3}{3} - \cdots.$$
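
A quick numerical comparison of the two expressions, again only a sketch with a large finite N replacing the infinite one; `log_via_root` and `log_series` are hypothetical names introduced for illustration.

```python
import math

def log_via_root(h: float, N: int = 10**6) -> float:
    """N * ((1 + h)**(1/N) - 1), the finite-N analogue of Euler's expression."""
    return N * ((1.0 + h) ** (1.0 / N) - 1.0)

def log_series(h: float, terms: int = 200) -> float:
    """Partial sum of h - h**2/2 + h**3/3 - ... (meaningful for |h| < 1)."""
    return sum((-1) ** (k + 1) * h**k / k for k in range(1, terms + 1))

if __name__ == "__main__":
    for h in (-0.5, 0.1, 0.5):
        print(h, log_via_root(h), log_series(h), math.log1p(h))
```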


3. CRITICISM AND EARLY ATTEMPTS AT JUSTIFICATION. The Achilles 
heel of Leibniz’s calculus was the “equality” x + dx =x. The most vehement 
criticism of the inadequate justification of the calculus came from Paris, the 
contemporary center of research that had provided Leibniz with powerful stimuli, 
derived from contacts with Huygens and the study of the writings of Pascal, during 
his stay from 1672 to 1676. On 2 February 1702 Leibniz wrote a detailed letter to 
Varignon, his Paris ally, which the latter published immediately. A few years later 
the Parisians became converts to the new calculus, and after that there were 
virtually no objections to it on the European continent. 

Leibniz provided many justifications. He argued that, much like imaginary 
numbers, infinite and infinitely small distances can be used without reservations as 
ideal concepts, without having to be regarded as real objects. It is more convenient 
to introduce once and for all the concept of the incomparably small instead of 
always having to speak about magnitudes capable of unlimited decrease. Neverthe- 
less, as is shown by the letter to Wallis quoted earlier, it is this very turn of phrase 
that justifies the calculus, by showing that the error is invariably less than any 
assignable magnitude. 

This is, so to say, a pragmatic argument: The calculus of differentials is more 
convenient than talking about constantly decreasing magnitudes, more convenient 
than the method of limits advocated by Newton. Then comes a ‘metaphysical’ 
argument, an appeal to the general continuity principle. Leibniz asserts that the 
rules of the finite hold for the infinite and conversely. The immediate question is: 
Which rules? Surely the algebraic rules, which Leibniz carries over without 
hesitation from the real numbers to differentials. But that is not all. Euler’s use of 
infinitely large natural numbers also accords with the principle of permanence of 
the rules of operation. Leibniz wrote too little, and what he wrote includes no 
systematic account. Hence it is impossible to draw unassailable conclusions, based 
on his usage, concerning the meaning of general pronouncements on the validity of 
rules. This is different in the case of Euler. 

Speaking of Euler. While his use of the infinitely large and small in the 
Introductio and in his papers on series can be made to harmonize with the 
principle of permanence of rules of operation, a similar reconciliation of usage and 
theory is well nigh impossible when it comes to the chapter On the infinite and on 
the infinitely small in his book on the differential calculus (1755). This chapter has 
generated more confusion than clarity. I won’t go into this matter here and refer 
the interested reader to pp. 206-211 of my book of 1986. 

The persuasive power of the calculus derived from the incontestable successes it 
amassed in the 18th century. But the need for a clear justification was imperative. 
In this connection we must mention, first of all, J. L. Lagrange, who took over 
Euler’s position in Berlin in 1766. In 1784, one year after Euler’s death, he 
announced a prize competition of the Berlin Academy. He alluded to the fact that 
many philosophers and mathematicians view the notion of the infinite as inconsistent, 
and posed the problem of explaining how it is possible to derive so many true 
results from an inconsistent assumption. In addition, participants were to provide a 
genuinely mathematical principle that would replace the notion of the infinite 
without, however, leading to a loss of simplicity and clarity in the derived results. 

None of the approximately two dozen responses satisfied Lagrange, so that in 
1797, while in Paris, he published his own proposed solution. The long title of his 
book states clearly what he had in mind: A theory of analytic functions, including 
the foundations of the differential calculus, free of all consideration of the 
infinitely small, of the vanishing, of limits and fluxions, returned to the analysis of 
finite magnitudes. 

Lagrange’s analytic functions are (formal) power series 


$$f(x + h) = f(x) + h\cdot g(x) + h^2[\cdots].$$


He wanted to work with them without involving convergence arguments. His 
attempt failed, but it has had an abiding influence. The g(x) in the preceding 
formula stands for the derivative, which he denoted by the symbol f’(x). Since 
then, the derivative has occupied a foreground position, next to the differential 
quotient. Since Lagrange also investigated properties of linear order, he gave the 
first remainder estimates. We add that when it came to applications, he too 
thought in pragmatic terms. In his analytical mechanics he computed with differen- 
tials. 


4. NEW DEMANDS FROM PHYSICS. Around 1750 it became clear that the 
algebraic-algorithmic notion of functions was inadequate from the viewpoint of 
physical applications. The famous example is that of the vibrating string. Daniel 
Bernoulli (1700-1782), son of Johann, found that an adequate mathematical 
treatment of this phenomenon required the use of series such as $\sum b_n \sin nx$. He 
and Euler realized that such series represented fairly arbitrary functions, whose 
behavior was very different from that of hitherto admissible functions. In particu- 
lar, they did not have to obey the same “law” throughout their respective domains 
of definition. Thus 


$$\sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots$$
is equal to $(\pi - x)/2$ for $0 < x < 2\pi$ and $(3\pi - x)/2$ for $2\pi < x < 4\pi$. The sum 
function is piecewise linear and has a jump discontinuity at all integral multiples of 
$2\pi$. Euler obtained this result by summing a geometric progression with ratio 
$q = re^{ix}$: 


$$\sum_{n=0}^{\infty} r^n e^{inx} = \frac{1}{1 - re^{ix}} = \frac{1 - re^{-ix}}{(1 - re^{ix})(1 - re^{-ix})} = \frac{1 - r\cos x + ir\sin x}{1 - 2r\cos x + r^2}.$$
Its real part is 
$$1 + r\cos x + r^2\cos 2x + r^3\cos 3x + \cdots = \frac{1 - r\cos x}{1 - 2r\cos x + r^2}.$$


This is valid for $0 < r < 1$. Nevertheless, we follow Euler and use this formula 
uncritically for $r = 1$. The result is 


$$1 + \cos x + \cos 2x + \cos 3x + \cdots = \frac{1}{2}$$
(1754; see Opera ser. I, 14, pp. 542-584 and 15, pp. 435-497.) Termwise integration 
of the latter series yields 


$$\sum_{n=1}^{\infty} \frac{\sin nx}{n} = \frac{\pi - x}{2} \quad \text{for } 0 < x < 2\pi.$$


The value of the constant of integration is obtained by putting x = 7 in the series. 
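
Both steps can be checked numerically. This sketch is a modern sanity check rather than Euler's procedure: it sums the damped cosine series for r slightly below 1 and compares a long partial sum of the sine series with (π − x)/2. The function names are illustrative assumptions.

```python
import math

def damped_cosine_sum(x: float, r: float, terms: int = 50000) -> float:
    """Partial sum of 1 + r cos x + r^2 cos 2x + ... for 0 < r < 1."""
    return sum(r**n * math.cos(n * x) for n in range(terms))

def closed_form(x: float, r: float) -> float:
    """(1 - r cos x) / (1 - 2 r cos x + r^2), the real part derived in the text."""
    return (1 - r * math.cos(x)) / (1 - 2 * r * math.cos(x) + r * r)

def sine_series(x: float, terms: int = 200000) -> float:
    """Partial sum of sin x + sin 2x / 2 + sin 3x / 3 + ..."""
    return sum(math.sin(n * x) / n for n in range(1, terms + 1))

if __name__ == "__main__":
    print(damped_cosine_sum(1.0, 0.999), closed_form(1.0, 0.999))
    print(sine_series(1.0), (math.pi - 1.0) / 2)
```

As r approaches 1 the closed form tends to 1/2 for every x that is not a multiple of 2π, which is the value Euler assigned to the divergent cosine series.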

What Euler does here is different from what he did 20 years earlier. He no 
longer computes with possibly infinite values of a divergent series but assigns to it 
the appropriate numerical value of the expression that gave rise to it. In the case 
of our geometric progression, the expression in question is $(1 - q)^{-1}$. In three 
notes written between 1771 and 1773, the physicist Daniel Bernoulli adopted 
this algorithmic approach and developed it further (D. Bernoulli, Werke Bd. 2, 
Basel 1982; the editor regards this as an application of the Leibniz principle of 
continuity). 

We may view this method as highly questionable, but its successes were beyond 
doubt. Just as in the case of differentials, we are challenged to provide a clear 
justification for a successful modus operandi. I provide such a modern justification 
in my book of 1986 (see pp. 181 ff.). 

The modern field-theoretic conception of physics began with the theory of heat 
developed after 1807 by J. B. Joseph Fourier. This resulted in a viewpoint that 
regarded all physical magnitudes as solution functions of (partial) differential 
equations. Since these functions were supposed to correspond to real measure- 
ments of physical magnitudes, it followed that the only useful functions were those 
whose values for real inputs were real. As a result, in the first instance, functions 
given by formal expressions, including divergent series, were marginalized. 


Ahornweg 23 
D-64367 Muhltal 
Germany 
PROBLEMS AND SOLUTIONS 


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 
with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, M. J. Pelling, Richard Pfiefer, Leonard Smiley, John Henry 
Steelman, Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY 
problems address on the inside front cover. Submitted problems should include 
solutions and relevant references. Submitted solutions should arrive at that address 
before October 31, 1997. Additional information, such as generalizations and 
references, is welcome. The problem number and the solver’s name and address 
should appear on each solution. An acknowledgement will be sent only if a mailing 
label is provided. An asterisk (*) after the number of a problem or a part of a 
problem indicates that no solution is currently available. 


PROBLEMS 


10585. Proposed by Roger Pinkham, Hoboken, NJ. Three points are selected independently 
and at random in a disk of radius one. What is the average distance of the third from the 
line determined by the first two? 


10586. Proposed by Donald E. Knuth, Stanford University, Stanford, CA. A certain matrix 
has $m$ rows and $n = 1 + k^2$ columns. All entries of the matrix are $\pm 1$, and the dot product 
of any two columns is less than or equal to 0. Prove that the total number of positive entries 
in the matrix is at most $\tfrac{1}{2}m(n + k)$, and construct a matrix that achieves this upper bound. 


10587. Proposed by Dave Witte, Oklahoma State University, Stillwater, OK. Suppose a 
particle moves continuously in the plane in such a way that the distance between its position 
at any two different times depends only on the difference between the times. Prove that the 
particle travels either on a circular arc or on a straight line. 


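10588. Proposed by Herbert S. Wilf, University of Pennsylvania, Philadelphia, PA. Show 
that 
$$\prod_{n=1}^{\infty}\left\{e^{-1/n}\left(1 + \frac{1}{n} + \frac{1}{2n^2}\right)\right\} = \frac{e^{\pi/2} + e^{-\pi/2}}{\pi e^{\gamma}},$$
where $\gamma$ is Euler’s constant. 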
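
A numerical sanity check of this product is easy to run. The sketch below is not a proof; the value of Euler's constant is hard-coded to double precision, and `partial_product` is an illustrative name. It assumes the product has the form e^{-1/n}(1 + 1/n + 1/(2n^2)).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded to double precision

def partial_product(terms: int = 100000) -> float:
    """Product over n = 1..terms of exp(-1/n) * (1 + 1/n + 1/(2 n^2))."""
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= math.exp(-1.0 / n) * (1.0 + 1.0 / n + 1.0 / (2.0 * n * n))
    return prod

def closed_value() -> float:
    """(e^{pi/2} + e^{-pi/2}) / (pi * e^gamma)."""
    return (math.exp(math.pi / 2) + math.exp(-math.pi / 2)) / (math.pi * math.exp(EULER_GAMMA))

if __name__ == "__main__":
    print(partial_product(), closed_value())
```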


10589. Proposed by Paul Bateman, University of Illinois, Urbana, IL and David Bradley, 
Simon Fraser University, Burnaby, BC, Canada. 
(a) Prove the identity 
$$\sum_{j=0}^{2^{k-1}-1} (-1)^{n(j)} (y + j)^k = (-1)^{k-1}\, 2^{(k-1)(k-2)/2}\, k!\, \left(y + 2^{k-2} - 1/2\right),$$
where $n(j)$ is the number of ones in the binary expansion of the nonnegative integer $j$. 
(b) Use part (a) to infer that there is a positive integer $s = s(k)$ such that every integer $n$ 
is expressible in the form $n = \varepsilon_1 x_1^k + \varepsilon_2 x_2^k + \cdots + \varepsilon_s x_s^k$ in infinitely many ways, where 
$\varepsilon_i = \pm 1$ for $1 \le i \le s$ and where $x_1, x_2, \ldots, x_s$ are distinct positive integers. 
10590. Proposed by David Cox, Amherst College, Amherst, MA. Fix an integer $n \ge 2$ and 
let $d_1, d_2, \ldots, d_n$ be positive integers with no common divisor greater than 1. Suppose that 
$d_i$ divides $d_1 + \cdots + d_n$ for all $i = 1, \ldots, n$. 

(a) Prove that $d_1 d_2 \cdots d_n$ divides $(d_1 + \cdots + d_n)^{n-2}$. 

(b) For each $n \ge 3$, give an example to show that the exponent in part (a) cannot be made 
smaller. 


10591*. Proposed by Jeffrey C. Lagarias, AT&T Research, and Thomas J. Richardson, Bell 
Laboratories, Murray Hill, NJ. Let $F_1, F_2, F_3, F_4$ denote the faces of a tetrahedron. For 
$i = 1, 2, 3, 4$, let $\alpha_i$ denote the solid angle of the vertex opposite face $F_i$, where the measure 
of a solid angle is normalized so that a full solid angle is 1, and let $\beta_i$ denote the area of $F_i$, 
where the unit of area is normalized so that the tetrahedron has surface area 1. 

(a) Prove that $\beta_i > \alpha_i$. 

(b) Generalize to n dimensions. 


SOLUTIONS 


Subcollections with Large Intersection 


10383 [1994, 473]. Proposed by Kevin Ford (student), University of Illinois, Urbana, IL. Let 
$B_1, \ldots, B_s$ denote subsets of a finite set $B$, and let $\lambda_i = \#(B_i)/\#(B)$ and $\lambda = \lambda_1 + \cdots + \lambda_s$. 
Show that, for every integer $t$ satisfying $1 \le t \le \lambda$, there exist $r_1, \ldots, r_t$ with $r_1 < \cdots < r_t$ 
and 


$$\#(B_{r_1} \cap \cdots \cap B_{r_t}) \ge (\lambda - t + 1)\binom{s}{t}^{-1}\#(B).$$


Solution by Mark Bowron, Houston, TX. It suffices to show that the quantity on the right is at most the 
average size of the intersection over all $\binom{s}{t}$ choices of $\{r_i\}$; that is, $N \ge (\lambda - t + 1)\#(B)$, 
where $N$ is the sum of the intersection sizes. With $B = \{x_1, \ldots, x_n\}$, let $u_k$ be the number 
of sequences $\{r_i\}$ such that $x_k \in \bigcap_{i=1}^{t} B_{r_i}$, and let $v_k$ be the number of members of $\{B_i\}$ 
that contain $x_k$. Then 


$$N = \sum_{k=1}^{n} u_k = \sum_{k=1}^{n}\binom{v_k}{t} \ge \sum_{k=1}^{n}(v_k - t + 1) = \sum_{i=1}^{s}\#(B_i) - nt + n = (\lambda - t + 1)\#(B).$$


Editorial comment. The proposer and Victor Hernandez observed that the result holds also 
for events $B_1, \ldots, B_s$ in a probability space, where $\lambda_i = P(B_i)$ and we seek a lower bound 
on $\max P(\bigcap B_{r_i})$. Some solvers observed that the bound holds also when $\lambda < t < \lambda + 1$. 
A. N. ’t Woord observed that $F\colon [0, \infty) \to [0, \infty)$ defined by 


$$F(x) = \begin{cases} 0 & \text{if } 0 \le x \le t - 1, \\ \binom{x}{t} = x(x-1)\cdots(x-t+1)/t! & \text{if } x > t - 1 \end{cases}$$
is a convex function. Hence $\frac{1}{n}\sum_{k=1}^{n} F(v_k) \ge F\left(\frac{1}{n}\sum_{i=1}^{s}\#B_i\right) = F(\lambda)$. Thus when $1 \le t \le \lambda$ 
the lower bound can be improved by replacing $(\lambda - t + 1)$ with $\binom{\lambda}{t}$. In a more detailed 
solution, John H. Lindsey II obtained both latter extensions. 


Solved also by R. Barbara (Lebanon), R. J. Chapman (U. K.), V. Hernandez (Spain), J. H. Lindsey II, O. P. Lossers 
(The Netherlands), R. J. Simpson (Australia), P. Tracy, A. N. ’t Woord (The Netherlands), USA Problem Group, and 
the proposer. 
A Lower Bound on Correlation 


10384 [1994, 474]. Proposed by Franklin Kemp, East Texas State University, Commerce, 
TX. Suppose $x_1 < x_2 < \cdots < x_n$ and $y_1 < y_2 < \cdots < y_n$. Define the correlation 
coefficient $r$ in the usual way: 


$$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \cdot \sum_i (y_i - \bar{y})^2}},$$


where $\bar{x}$ and $\bar{y}$ are the average values of the $x_i$ and $y_i$, respectively, and the sums run from 
1 to $n$. Show that $r \ge 1/(n - 1)$. 


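Solution by Robin J. Chapman, University of Exeter, Exeter, U. K. By replacing $x_i$ by $x_i - \bar{x}$ 
and $y_i$ by $y_i - \bar{y}$ we may assume that $\bar{x} = \bar{y} = 0$. Let 


$$C = \{(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n : x_1 \le x_2 \le \cdots \le x_n,\ x_1 + x_2 + \cdots + x_n = 0\}.$$


Denote the Euclidean inner product of two elements $u, v$ of $\mathbb{R}^n$ by $u \cdot v$. Let $u_j = 
(1, 1, \ldots, 1, 0, \ldots, 0)$ (with $j$ 1's) for $0 \le j \le n$ and $v_j = (1/n)u_n - (1/j)u_j$ for 
$1 \le j \le n - 1$. Note that each $v_j$ is in $C$. 

Given $x = (x_1, x_2, \ldots, x_n) \in C$, a computation shows that 


$$x = \sum_{j=1}^{n-1} j(x_{j+1} - x_j)\,v_j,$$


and so $x = \sum_j s_j v_j$ with each $s_j \ge 0$. Similarly, given $y \in C$, $y = \sum_j t_j v_j$ with each $t_j \ge 0$. 
Now let $a_{ij} = v_i \cdot v_j$. If $i \le j$ then 


$$a_{ij} = \left(\frac{1}{n}u_n - \frac{1}{i}u_i\right)\cdot\left(\frac{1}{n}u_n - \frac{1}{j}u_j\right) = \frac{1}{j} - \frac{1}{n},$$
independent of $i$, and so 


$$\frac{a_{ij}}{\sqrt{a_{ii}a_{jj}}} = \sqrt{\frac{a_{jj}}{a_{ii}}} \ge \sqrt{\frac{a_{n-1,n-1}}{a_{11}}} = \frac{1}{n - 1}.$$


Thus, the desired inequality is valid among the $v_i$. Using this, we find 
$$x \cdot y = \sum_i \sum_j a_{ij}s_i t_j \ge \frac{1}{n - 1}\sum_i \sum_j s_i t_j\sqrt{a_{ii}a_{jj}} = \frac{1}{n - 1}\left(\sum_i s_i\sqrt{a_{ii}}\right)\left(\sum_j t_j\sqrt{a_{jj}}\right).$$
Since the Cauchy-Schwarz inequality gives $a_{ij} = v_i \cdot v_j \le \sqrt{a_{ii}a_{jj}}$, we have 
$$x \cdot x = \sum_i \sum_j a_{ij}s_i s_j \le \sum_i \sum_j s_i s_j\sqrt{a_{ii}a_{jj}} = \left(\sum_i s_i\sqrt{a_{ii}}\right)^2$$


and similarly 


$$y \cdot y \le \left(\sum_j t_j\sqrt{a_{jj}}\right)^2.$$


Thus, we conclude that 
$$\frac{x \cdot y}{\sqrt{(x \cdot x)(y \cdot y)}} \ge \frac{1}{n - 1},$$


as required. 
This argument can be refined to show that the inequality is strict for $n \ge 3$, but $1/(n - 1)$ 
is the best possible constant. 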
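
The bound is easy to probe numerically. The sketch below is a sanity check, not part of the solution: it computes r for random sorted data and for one weakly increasing pair (a single jump at opposite ends) at which r equals 1/(n − 1). The helper names are illustrative assumptions.

```python
import math
import random

def correlation(xs, ys):
    """Product-moment correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def extremal_pair(n):
    """x jumps once after its first entry, y once before its last entry."""
    xs = [0.0] + [1.0] * (n - 1)
    ys = [0.0] * (n - 1) + [1.0]
    return xs, ys

if __name__ == "__main__":
    n = 6
    xs = sorted(random.random() for _ in range(n))
    ys = sorted(random.random() for _ in range(n))
    print(correlation(xs, ys), 1 / (n - 1), correlation(*extremal_pair(n)))
```

The extremal pair corresponds, after centering, to the directions v_1 and v_{n-1} used in the solution.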
Editorial comment. As some solvers remarked, it suffices to require only $x_1 \le x_2 \le \cdots \le x_n$ 
with $x_1 < x_n$ (and similarly for the $y_i$), but one must demand that $n > 1$. 

Can Anh Minh noted that the result appears in Christopher Bradley and Nick Lord, Com- 
puting Spearman’s rank and the product-moment correlation coefficients, The Mathematical 
Gazette 77 (1993), 84-88. 


Solved also by P. G. Kirmser, O. Krafft & M. Schaefer (Germany), R. Kreminski, J. H. Lindsey II, L. E. Mattics, D. P. 
Walsh, T. White, and the proposer. 




An Application of Snake Oil 


10388 [1994, 474]. Proposed by E. Sparre Andersen and Mogens Esrom Larsen, Københavns 
Universitet, København, Denmark. Find 


$$\sum_{k=0}^{n}\binom{n}{k}\binom{p + \frac{n-3}{4} - \frac{k}{2}}{2p},$$


where $n$ and $p$ are positive integers. 


Solution by H. van Haeringen, Delft University of Technology, Delft, The Netherlands. The 
proof uses methods described in H. S. Wilf, generatingfunctionology (second ed.), Academic 
Press, 1994, which we denote by [g]. Let $S_n(p)$ be the sum in question; we prove that 


$$S_n(p) = (-1)^p\, 2^{n-4p}\binom{2p - \frac{n+1}{2}}{p}.$$


First we use the “Snake Oil method” [g, p. 118-130] to prove that 


$$\sum_{p=0}^{\infty}(-4\sin^2\theta)^p\binom{a + p}{2p} = \frac{\cos(1 + 2a)\theta}{\cos\theta}. \qquad (1)$$


Letting $L_a$ be the left-hand side of (1), we have 


$$\sum_{a=0}^{\infty} L_a z^a = \sum_{p=0}^{\infty}(-4\sin^2\theta)^p\sum_{a=0}^{\infty}\binom{a + p}{2p}z^a = \sum_{p=0}^{\infty}(-4\sin^2\theta)^p\frac{z^p}{(1 - z)^{2p+1}}$$
$$= \frac{1 - z}{(1 - z)^2 + 4z\sin^2\theta} = \frac{1 - z}{1 - 2z\cos 2\theta + z^2}$$
$$= \frac{1}{2\cos\theta}\left(\frac{e^{i\theta}}{1 - ze^{2i\theta}} + \frac{e^{-i\theta}}{1 - ze^{-2i\theta}}\right) = \sum_{a=0}^{\infty}\frac{\cos(1 + 2a)\theta}{\cos\theta}z^a.$$


This proves (1) when $a$ is a nonnegative integer. On both sides, the coefficient of $\theta^m$ in the 
power series expansion is a polynomial in $a$; therefore (1) holds for all real numbers $a$. 


By (1), 


$$\sum_{p=0}^{\infty}(-4\sin^2\theta)^p S_n(p) = \sum_{k=0}^{n}\binom{n}{k}\frac{\cos\left(1 + (n - 3)/2 - k\right)\theta}{\cos\theta}.$$


This sum equals $2^{(n-1)/2}(1 + \cos\theta)^{(n+1)/2}/\cos\theta$ by a straightforward computation using 
$\cos t = (e^{it} + e^{-it})/2$ and the binomial theorem. Letting $u = \sin^2\theta$, we obtain 


$$\sum_{p=0}^{\infty}(-4u)^p S_n(p) = \frac{2^{(n-1)/2}\left(1 + \sqrt{1 - u}\right)^{(n+1)/2}}{\sqrt{1 - u}} = \frac{2^n}{\sqrt{1 - u}}\left(\frac{1 - \sqrt{1 - u}}{u/2}\right)^{-(n+1)/2} = 2^n\sum_{p=0}^{\infty}\left(\frac{u}{4}\right)^p\binom{2p - \frac{n+1}{2}}{p},$$
where the last step follows by a well-known identity [g, equation 2.5.15]. The formula for 
$S_n(p)$ follows by equating coefficients of $u^p$ on each side. 
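
Identity (1) can be spot-checked for nonnegative integer a. The sketch below assumes (1) has the form "sum over p of (−4 sin²θ)^p C(a+p, 2p) equals cos((1+2a)θ)/cos θ"; the function names are illustrative. For integer a the binomial coefficient vanishes once p exceeds a, so the sum is finite.

```python
import math

def snake_oil_lhs(a: int, theta: float) -> float:
    """Sum over p of (-4 sin^2 theta)^p * C(a + p, 2p); terms vanish for p > a."""
    s = -4.0 * math.sin(theta) ** 2
    return sum(s**p * math.comb(a + p, 2 * p) for p in range(a + 1))

def snake_oil_rhs(a: int, theta: float) -> float:
    """cos((1 + 2a) theta) / cos(theta)."""
    return math.cos((1 + 2 * a) * theta) / math.cos(theta)

if __name__ == "__main__":
    for a in range(5):
        print(a, snake_oil_lhs(a, 0.3), snake_oil_rhs(a, 0.3))
```

The extension to real a used in the solution is not covered by this check, since `math.comb` accepts only integers.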


Editorial comment. The “Snake Oil method” to evaluate a sum involves forming a gener- 
ating function over a fixed parameter a, interchanging the order of summation, and then 
performing the inner sum. The proposers’ proof uses formulas that they developed to study 
more general sums; van Haeringen also gave two proofs using hypergeometric series and a 
fourth proof based on a recurrence for S,(p). Herbert S. Wilf and Istvan Nemes provided 
computer proofs based on the algorithm of D. Zeilberger, A fast algorithm for proving ter- 
minating hypergeometric identities, Discrete Math. 80 (1990), 207-211. Wilf notes that 
the computer method now provides a general approach to such sums. 


Solved also by I. Nemes (Austria), H. S. Wilf and the proposers. 


Tower of Power 


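10389 [1994, 574]. Proposed by Raphael M. Robinson, University of California, Berkeley, 
CA. Find all solutions of the equation 


$$a_1^{a_2^{\cdot^{\cdot^{\cdot^{a_m}}}}} = b_1^{b_2^{\cdot^{\cdot^{\cdot^{b_n}}}}},$$


where $m \ge 1$, $n \ge 1$, the $a_i$ and $b_i$ are integers with $2 \le a_i \le 4$ and $2 \le b_i \le 4$, and 
$a_1 \ne b_1$. 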


Composite solution by all solvers, with a generalization by the editors. We show more 
generally that for $N \ge 2$ there are finitely many solutions with $m \ge 1$, $n \ge 1$, $a_1 \ne b_1$, and 
each $a_i, b_j \in \{2, \ldots, N\}$. In fact, the upper bound $N$ is needed only for $1 \le i \le 3$, and the 
solutions for each $N$ can be computed effectively. We represent the desired equation using 
the notation 


$$[\![a_1, \ldots, a_m]\!] = [\![b_1, \ldots, b_n]\!]. \qquad (1)$$


By unique factorization, it is clear that $a_1$ and $b_1$ have the same prime factors. We 
consider each such $a_1$ and $b_1$ with $a_1 \ne b_1$ and $2 \le a_1, b_1 \le N$. Since $a_1 \ne b_1$, there is a 
prime $p$ such that $p^r \,\|\, a_1$ and $p^s \,\|\, b_1$ where $r$ and $s$ are distinct positive integers (the notation 
$p^r \,\|\, a$ means that $p^r$ is the largest power of $p$ dividing $a$). We may assume that $r < s$. By 
considering the largest power of $p$ that divides both sides of (1), we deduce that 


$$r[\![a_2, \ldots, a_m]\!] = s[\![b_2, \ldots, b_n]\!]. \qquad (2)$$


The cases in (2) where $a_2$ and $b_2$ have different sets of prime factors can be solved easily. 
If $q$ is a prime dividing $b_2$ but not $a_2$, then $[\![q, b_3, \ldots, b_n]\!]$ divides $r$. Thus $[\![b_3, \ldots, b_n]\!] \le 
\log_2 r$, and $[\![b_1, \ldots, b_n]\!] \le [\![N, N, \log_2 r]\!]$. In particular, there are finitely many such 
values. 

We may suppose then that $a_2$ and $b_2$ have the same prime factors. Since $r < s$, there 
are nonnegative integers $r'$ and $s'$ with $r' < s'$ and a prime $q$ such that $q^{r'} \,\|\, r$ and $q^{s'} \,\|\, s$. 
From (2), we must have $q \mid a_2$, and hence also $q \mid b_2$, since $a_2$ and $b_2$ have the same prime 
factors. Let $c$ and $d$ be positive integers such that $q^c \,\|\, a_2$ and $q^d \,\|\, b_2$. By considering the 
largest power of $q$ that divides both sides of (2), we deduce that 


$$r' + c[\![a_3, \ldots, a_m]\!] = s' + d[\![b_3, \ldots, b_n]\!],$$


and thus $c[\![a_3, \ldots, a_m]\!] > d[\![b_3, \ldots, b_n]\!]$. 
Since $2 \le a_3, b_3 \le N$, we need only find all nonnegative integer solutions $x$ and $y$ to 
the equations $cu^x - dv^y = w$ where $c$, $d$, $u$, $v$, and $w$ are integers satisfying 


$$1 \le c, d \le \log_2 N, \quad 1 \le w \le \log_2\log_2 N, \quad \text{and} \quad 2 \le u, v \le N.$$
That this can be done follows, for example, from Theorem 1.1 of Shorey and Tijdeman, 
Exponential Diophantine Equations, Cambridge Tracts in Math. 87, Cambridge, 1986. 

When $N = 4$, the problem reduces to solving the diophantine equations $2^x - 3^y = 1$ 
(with solution set $\{(1, 0), (2, 1)\}$) and $3^x - 2^y = 1$ (with solution set $\{(1, 1), (2, 3)\}$). These 
solutions are well known. They yield eleven solutions to the proposed problem: 


$$2^2 = 4, \quad 2^4 = 4^2, \quad 2^{2^2} = 4^2, \quad 2^{2^3} = 4^4, \quad 2^{2^3} = 4^{2^2}, \quad 2^{2^4} = 4^{2^3},$$

$$2^{2^{2^2}} = 4^{2^3}, \quad 2^{4^2} = 4^{2^3}, \quad 2^{2^{3^2}} = 4^{2^{2^3}}, \quad 2^{2^{3^2}} = 4^{4^4}, \quad \text{and} \quad 2^{2^{3^2}} = 4^{4^{2^2}}.$$
Editorial comment. All solvers used the same approach, restricted to the special case N = 4, 
but four solvers omitted one or more of the solutions. 
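
For N = 4 the count can be reproduced by a bounded search. The sketch below is not the solvers' method: it uses the fact from the solution that the bases must share their prime factors (so the bases are 2 and 4), rewrites 2^X = 4^Y as X = 2Y, and enumerates only exponent towers with value at most a chosen limit. The limit of 1000 and the function names are assumptions of this illustration; the finiteness argument in the text is what rules out larger solutions.

```python
def small_towers(limit: int, entries=(2, 3, 4)):
    """Map each tuple (e_1, ..., e_k) to its tower value e_1^(e_2^(...)), keeping values <= limit."""
    found = {(): 1}  # the empty tower has value 1
    frontier = [()]
    while frontier:
        next_frontier = []
        for seq in frontier:
            top = found[seq]
            if top > limit.bit_length():  # then even 2**top exceeds limit
                continue
            for e in entries:
                value = e**top
                if value <= limit:
                    found[(e,) + seq] = value
                    next_frontier.append((e,) + seq)
        frontier = next_frontier
    return found

def solutions_2_vs_4(limit: int = 1000):
    """Pairs of towers (2, X...) and (4, Y...) with equal value, i.e. X = 2Y."""
    towers = small_towers(limit)
    return sorted(((2,) + xs, (4,) + ys)
                  for xs, x in towers.items()
                  for ys, y in towers.items()
                  if x == 2 * y)

if __name__ == "__main__":
    for left, right in solutions_2_vs_4():
        print(left, "=", right)
```

Here a tuple such as (2, 2, 3, 2) stands for the tower 2^(2^(3^2)).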


Solved by R. Barbara (Lebanon), L. Cagliero (Argentina), R. J. Chapman (U. K.), J. Christopher, Z. Franco, H. W. 
Guggenheimer, R. Holzsager, N. Komanda, J. Kuliyev, J. H. Lindsey II, O. P. Lossers (The Netherlands), J. Merickel, 
C. A. Minh, A. Pedersen (Denmark), J. Rostand (Canada), T. Tran, T. White, A. N. ’t Woord (The Netherlands), NSA 
Problems Group, Prague Problem Solution Group (Czech Republic), and the proposer. 


Growing Inequalities 


10391 [1994, 574]. Proposed by Emre Alkan (student), Bosphorus University, Istanbul, 
Turkey, and the editors. If $a_1, a_2, \ldots, a_n$ are real numbers with $a_1 \ge a_2 \ge \cdots \ge a_n$, and if 
$\phi$ is a convex function defined on the closed interval $[a_n, a_1]$, then 


$$\sum_{k=1}^{n}\phi(a_k)\,a_{k+1} \ge \sum_{k=1}^{n}\phi(a_{k+1})\,a_k$$


with the convention that $a_{n+1} = a_1$. 


Solution I by Zachary Franco, Butler University, Indianapolis, IN. For $n = 1$ and 2 the 
inequality is trivial. Let $a_1, a_2, \ldots$ be an infinite nonincreasing sequence of real numbers 
and let $S_n = (\phi(a_n)a_1 - \phi(a_1)a_n) + \sum_{k=1}^{n-1}(\phi(a_k)a_{k+1} - \phi(a_{k+1})a_k)$. We must show 
$S_n \ge 0$. 

By Jensen’s inequality for convex functions, $\phi(\alpha x + \beta y) \le \alpha\phi(x) + \beta\phi(y)$. Put 


$$\alpha = \frac{a_n - a_{n+1}}{a_1 - a_{n+1}}, \quad \beta = \frac{a_1 - a_n}{a_1 - a_{n+1}}, \quad x = a_1, \quad \text{and} \quad y = a_{n+1}$$


and simplify to obtain 


$$(a_n - a_{n+1})\phi(a_1) + (a_1 - a_n)\phi(a_{n+1}) + (a_{n+1} - a_1)\phi(a_n) \ge 0.$$


The left side of this inequality is precisely $S_{n+1} - S_n$, which implies that $S_n$ is a nondecreasing 
sequence and hence must always be nonnegative. 


Solution II by Richard Holzsager, American University, Washington, DC. Fix $n$. The two 
sides are clearly equal for $\phi(x) = 1$ or $\phi(x) = x$, and since both sides are linear in $\phi$, we 
can subtract off the linear function that agrees with $\phi$ at the endpoints and reduce to the 
case where $\phi(a_1) = \phi(a_n) = 0$. Eliminating the terms with these factors allows the desired 
inequality to be rewritten $\sum_{k=2}^{n-1}\phi(a_k)(a_{k+1} - a_{k-1}) \ge 0$. But since $\phi$ is convex and zero 
at the endpoints, each $\phi(a_k)$ is nonpositive, as is the other factor. The inequality follows. 
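
The inequality can be tested on random data. The sketch below assumes the inequality reads "sum of φ(a_k) a_{k+1} is at least the sum of φ(a_{k+1}) a_k" with cyclic indices, which is the form consistent with both solutions; `cyclic_gap` is an illustrative name.

```python
import math
import random

def cyclic_gap(a, phi):
    """sum phi(a_k) a_{k+1} - sum phi(a_{k+1}) a_k, indices taken cyclically (a_{n+1} = a_1)."""
    n = len(a)
    return sum(phi(a[k]) * a[(k + 1) % n] - phi(a[(k + 1) % n]) * a[k]
               for k in range(n))

if __name__ == "__main__":
    a = sorted((random.uniform(-2, 2) for _ in range(6)), reverse=True)
    for name, phi in (("exp", math.exp), ("square", lambda x: x * x), ("abs", abs)):
        print(name, cyclic_gap(a, phi))
```

For example, with a = (2, 1, 0) and φ(x) = x² the gap is 4 − 2 = 2.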


Editorial comment. The left side of the three term inequality used in Solution I may be interpreted 
as the value of the well-known determinant formula for the area of the triangle with 
vertices $(a_i, \phi(a_i))$, $i = 1, n, n + 1$. The convexity condition determines the orientation 
of the triangle and hence the sign of this determinant. The solution of O. P. Lossers used 
this interpretation. 

The National Security Agency Problems Group gave a more visual form of Solution 
II. They began their solution by subtracting $\sum_{k=1}^{n}\phi(a_k)a_k$ from each side of the desired 
inequality. The resulting sums were then written with factors of $\delta_k = a_k - a_{k+1}$ in each 
term. This led to sums arising in the trapezoidal approximation to $\int_{a_n}^{a_1}\phi(x)\,dx$. 

An algebraic form of this problem appeared in Crux Mathematicorum as Problem 10- 
3, and it was discussed in detail in the “Olympiad Corner” section of that journal [1980, 
106—108 and 129-130]. 
Solved also by J. Alvarez (Spain), D. Borwein (Canada), R. J. Chapman (U. K.), J. H. Chyung, R. Ehrenborg (Canada), 
E. Escalona-Fernandez, J.-P. Grivaux (France), H. van Haeringen (The Netherlands), E. A. Herman, K. Hinderer & 
M. Stieglitz (Germany), J. Howard, F.-A. Izadi (Iran), G. Keselman, B. S. Kim (Korea), P. G. Kirmser, J. H. Lindsey II, 
O. P. Lossers (The Netherlands), P. McCartney, K. D. McLenithan, J. Merickel, C. A. Minh, A. Pedersen (Denmark), 
K. Schilling, M. Vowe (Switzerland), A. N. ’t Woord (The Netherlands), NSA Problems Group, Prague Problem Solution 
Group (Czech Republic), USA Problem Group, and the WMC Problems group. 


The Effect of an Alternating Series 


10398 [1994, 682]. Proposed by Leroy Quet, Denver, CO. Show that 


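\[
\sum_{n=1}^{\infty}\sum_{m=1}^{n}\frac{(-1)^{n+1}e}{m\cdot(n+1)!}
= \sum_{n=1}^{\infty}\sum_{m=1}^{n}\frac{1}{m\cdot(n+1)!}.
\]

Solution by Michael Vowe, Therwil, Switzerland. Let $H_n$ be the sum of the first $n$ terms of
the harmonic series, so $H_n = \sum_{m=1}^{n} 1/m = \int_0^1 (t^n - 1)/(t - 1)\,dt$. Consider the function
$f$ represented by the absolutely convergent infinite series
\[
f(x) = \sum_{n=1}^{\infty}\frac{H_n x^n}{(n+1)!}
= \int_0^1 \sum_{n=1}^{\infty}\frac{t^n-1}{t-1}\cdot\frac{x^n}{(n+1)!}\,dt
= \int_0^1 \frac{e^{tx} - 1 - te^{x} + t}{t(t-1)x}\,dt.
\]

We have $f(1) = \sum_{n=1}^{\infty} H_n/(n+1)! = \int_0^1 (e^t - 1 - te + t)/(t(t-1))\,dt$.
Consider also the function $g$ represented by the absolutely convergent infinite series
\[
g(x) = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}e^{x}H_n x^n}{(n+1)!}
= \int_0^1 \sum_{n=1}^{\infty}\frac{t^n-1}{t-1}\cdot\frac{(-1)^{n+1}e^{x}x^n}{(n+1)!}\,dt
\]
\[
= \int_0^1 \frac{e^{x}}{(t-1)x}\left(\frac{e^{-xt}-1}{t} - (e^{-x}-1)\right)dt
= \int_0^1 \frac{e^{(1-t)x} + (t-1)e^{x} - t}{(t-1)t\,x}\,dt.
\]

Setting $s = 1 - t$ yields
\[
\sum_{n=1}^{\infty}\frac{(-1)^{n+1}e}{(n+1)!}H_n = g(1)
= \int_0^1 \frac{e^{s} - 1 - se + s}{s(s-1)}\,ds = f(1),
\]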


as desired. 
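
The identity can be spot-checked numerically. The sketch below is not part of the published solution; it assumes the two sides are $\sum H_n/(n+1)!$ and $e\sum (-1)^{n+1}H_n/(n+1)!$, as in the final display, and the function names are illustrative only. Partial sums are accumulated exactly and converted to floating point at the end.

```python
from fractions import Fraction
from math import e, factorial, isclose


def harmonic(n):
    """Exact n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, m) for m in range(1, n + 1))


def both_sides(terms=30):
    """Return (sum H_n/(n+1)!, e * sum (-1)^(n+1) H_n/(n+1)!) truncated after `terms` terms."""
    plain = Fraction(0)
    alternating = Fraction(0)
    for n in range(1, terms + 1):
        term = harmonic(n) / factorial(n + 1)
        plain += term
        alternating += (-1) ** (n + 1) * term
    return float(plain), e * float(alternating)


plain_sum, scaled_alternating_sum = both_sides()
print(plain_sum, scaled_alternating_sum)
```

Thirty terms are far more than double precision needs, since the terms decay factorially.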


Solved also by J. Anglesio (France), D. Beckwith, D. Borwein (Canada), P. Bracken (Canada), D. Bradley, E. Braune
(Austria), P. Budney, R. J. Chapman (U. K.), H. Chen, D. A. Darling, R. Fraties, C. Georghiou (Greece), M. L. Glasser,
E. Hertz, M. Hoffman, R. Holzsager, G. L. Isaacs, G. Keselman, B. G. Klein, J. H. Lindsey II, O. P. Lossers (The
Netherlands), A. R. Miller, D. K. Nester, A. Nijenhuis, A. E. C. Núñez (Colombia), H. Prodinger (Austria), R. Sprugnoli,
R. Stong, D. B. Tyler, C. Y. Yildirim (Turkey), NSA Problems Group, USA Problems Group, WMC Problems Group,
and the proposer.


Bounding a Binomial Sum


10400 [1994, 682]. Proposed by Itshak Borosh, Douglas Hensley, and Arthur M. Hobbs, 
Texas A&M University, College Station, TX, and Anthony Evans, Wright State University, 
Dayton, OH. Determine the set of all pairs $(n, t)$ of integers with $0 \le t \le n$ and
\[
\sum_{k=0}^{t}\binom{n}{k} < \frac{n^t}{t!}.
\]

Solution by Stephen M. Gagola, Jr., Kent State University, Kent, OH. The inequality holds
precisely when $t = 4$ and $n \ge 7$, when $t = 5$ and $n \ge 6$, and when $6 \le t \le n$.

When $0 \le t \le 3$, the left side of the inequality is a polynomial in $n$ of degree $t$ whose
leading term is $n^t/t!$ and whose remaining nonzero terms have positive coefficients. Thus
the inequality cannot hold.

When $t = 4$, the inequality (times 24) is $n^4 - 2n^3 + 11n^2 + 14n + 24 < n^4$. This fails
for $n \in \{4, 5, 6\}$. For $n \ge 7$, we have $11n^2 + 14n + 24 < 11n^2 + 2n^2 + n^2 = 14n^2 \le 2n^3$,
and the inequality holds.

When $t = 5$, the inequality (times 120) is $n^5 - 5n^4 + 25n^3 + 5n^2 + 94n + 120 < n^5$.
This fails for $n = 5$. For $n \ge 6$, we have $25n^3 + 5n^2 + 94n + 120 < 25n^3 + n^3 + 3n^3 + n^3 =
30n^3 \le 5n^4$, and the inequality holds.

When $n = t \ge 6$, the inequality becomes $2^n n! < n^n$, which holds inductively for $n \ge 6$.
With these statements as basis, we now complete the proof by induction on $n$. For the
induction step, we consider $n > t$ and use the induction hypothesis to compute
\[
\sum_{k=0}^{t}\binom{n}{k}
= \sum_{k=0}^{t}\binom{n-1}{k} + \sum_{k=1}^{t}\binom{n-1}{k-1}
< \frac{(n-1)^t}{t!} + \frac{(n-1)^{t-1}}{(t-1)!}
\]
\[
= \frac{(n-1)^t}{t!}\left(1 + \frac{t}{n-1}\right)
\le \frac{(n-1)^t}{t!}\left(1 + \frac{1}{n-1}\right)^{t}
= \frac{n^t}{t!}.
\]
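
The characterization can be cross-checked by brute force with exact integer arithmetic. The following is only a sanity check over a finite range (the bound `N = 60` and the helper names are arbitrary choices made here), not a substitute for the induction above.

```python
from math import comb, factorial


def inequality_holds(n, t):
    """Test sum_{k<=t} C(n, k) < n^t / t! after multiplying both sides by t!."""
    return factorial(t) * sum(comb(n, k) for k in range(t + 1)) < n ** t


def claimed(n, t):
    """The set of pairs given in the solution."""
    return (t == 4 and n >= 7) or (t == 5 and n >= 6) or (6 <= t <= n)


N = 60  # arbitrary search bound for this check
mismatches = [
    (n, t)
    for n in range(N + 1)
    for t in range(n + 1)
    if inequality_holds(n, t) != claimed(n, t)
]
print(mismatches)
```

An empty list of mismatches means the stated set agrees with direct computation for every pair with $0 \le t \le n \le 60$.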


Solved also by J. Anglesio (France), M. Ascher, J. T. Bruening, J. Christopher, J. S. Frame, J. Graham & J. B. Klerlein 
& S. Sportsman, R. Holzsager, G. L. Isaacs, N. Komanda, J. Lauret (Argentina), J. H. Lindsey II, G. T. Lee (Canada), 
O. P. Lossers (Netherlands), G. Myerson (Australia), I. Nemes (Austria), V. Novakov (Bulgaria), R. Stong, D. B. Tyler, 
NSA Problems Group, WMC Problems Group, and the proposer. 


The Average Order of an Element in a Group 


10410 [1994, 911]. Proposed by Frank Schmidt, Arlington, VA. Let $G$ be a finite group.
Define $a(G)$ to be the average order of an element of $G$. If $|G| \ne 1$, can $a(G)$ be an
integer?


Solution by Peter L. Montgomery, San Rafael, CA. Yes. First, we make two observations.
(1) If $G_1$ and $G_2$ are finite groups and $\gcd(|G_1|, |G_2|) = 1$, then
$a(G_1 \times G_2) = a(G_1)a(G_2)$.
(2) If $p$ is a prime and $C(p)$ denotes the cyclic group of order $p$, then
$a(C(p)) = (p^2 - p + 1)/p$.
To prove (1), observe that the order of an element $(g_1, g_2)$ of $G_1 \times G_2$ is always the least
common multiple of the orders of $g_1$ and $g_2$. The assumption guarantees that this is the
product of the orders of $g_1$ and $g_2$. Thus, (1) follows by grouping the terms in the sum of
the orders. To prove (2), note that $C(p)$ has $p - 1$ elements of order $p$ and 1 element of
order 1.

We look for a cyclic group $G = C(p_1) \times C(p_2) \times \cdots \times C(p_n)$, where the $p_i$ are
distinct primes. This gives the example we seek if $p_1 p_2 \cdots p_n$ divides
$(p_1^2 - p_1 + 1)(p_2^2 - p_2 + 1) \cdots (p_n^2 - p_n + 1)$. Any prime factor of the latter product (other than 3) must
be congruent to 1 modulo 3, so try $p_1 = 7$. Continuing: $p_1^2 - p_1 + 1 = 43$, so try $p_2 = 43$;
$p_2^2 - p_2 + 1 = 1807 = 13 \times 139$, so try $p_3 = 13$; $p_3^2 - p_3 + 1 = 157$, so try $p_4 = 157$;
$p_4^2 - p_4 + 1 = 24493 = 7 \times 3499$. Since we now have a multiple of $p_1$, the cycle is
complete. With $G = C(7) \times C(13) \times C(43) \times C(157) = C(614341)$, $a(G)$ is the integer
$139 \times 3499 = 486361$.

A noncyclic abelian example can be found in a similar way. Let $G = C(13) \times C(13) \times
C(23)$ (of order 3887). Then $a(G) = (13^3 - 13 + 1)(23^2 - 23 + 1)/3887 = 285$.
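
Both examples are small enough to confirm by direct enumeration. The sketch below is an illustration added here, not Montgomery's method: it represents an abelian group as a product of cyclic groups $\mathbb{Z}/m_1 \times \cdots \times \mathbb{Z}/m_r$ and averages the element orders exactly. The function name `average_order` is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def average_order(moduli):
    """Average element order in Z/m1 x ... x Z/mr, computed by enumerating all elements."""
    total = 0
    count = 0
    for element in product(*(range(m) for m in moduli)):
        order = 1
        for x, m in zip(element, moduli):
            # The order of x in Z/m is m / gcd(x, m); gcd(0, m) = m gives order 1.
            order = lcm(order, m // gcd(x, m))
        total += order
        count += 1
    return Fraction(total, count)


print(average_order([614341]))      # the cyclic example C(7) x C(13) x C(43) x C(157)
print(average_order([13, 13, 23]))  # the noncyclic abelian example of order 3887
```

The cyclic example is passed as a single modulus because $C(7) \times C(13) \times C(43) \times C(157)$ is cyclic of order 614341.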
Editorial comment. Douglas B. Tyler noted that there are at least 100 examples of such 
groups, and raised the question of whether there are infinitely many examples. One example 
can often spawn others, since (1) implies that if a(G) is an integer, d is any divisor of a(G) 
relatively prime to |G|, and H is any group of order d, then a(G x H) is an integer. 

Nilpotent groups are characterized as direct products of their Sylow subgroups (see 
Marshall Hall, Jr., The Theory of Groups, Macmillan, 1959, ch. 10). Knowing a(G) for 
all groups G of prime-power order allows a systematic study of this problem for nilpotent 
groups. In particular, one can verify that the smallest nilpotent group G for which a(G) is 
an integer is the abelian group of order 3887 that has already been mentioned. 

The examples in the selected solution were most frequently cited. Another popular 
example was constructed by taking the direct product of C(5) with the nonabelian group of 
order 21. John H. Lindsey II proved that this is the smallest group with the desired property. 
The proof consists of showing that any smaller example must be nilpotent by using the fact 
that smaller numbers have the form $p^a q$, and two more simple observations:

(3) The sum of the orders of the elements of a group is always odd.

(4) If $p$ divides $|G|$, then the sum of the orders of the elements of $G$ is congruent modulo
$p$ to the sum of the orders of elements in the centralizer of a $p$-Sylow subgroup.

First, let a p-Sylow subgroup act on G by conjugation. Elements in the same orbit have the 
same order, so only the fixed points contribute to the sum modulo p. This gives (4). Indeed, 
this centralizer further splits as a direct product of a group whose order is a power of p and 
one whose order is relatively prime to p. One could further restrict to the latter factor, since 
the other elements have orders that are multiples of p. The proof of (3) is similar, starting 
from the fact that an element has the same order as its inverse. 


Solved also by R. Holzsager, J. Lauret (Argentina), J. H. Lindsey II, J. H. Nieto (Venezuela), R. Stong, D. B. Tyler, 
A.N. ’t Woord (The Netherlands), and the proposer. 


A Bernoulli Convolution 


10416 [1994, 912]. Proposed by Kwang-Wu Chen (student), National Chung Cheng University,
Chia-Yi, Taiwan, Republic of China. The Bernoulli numbers $B_n$ (for $n = 0, 1, 2, \ldots$)
are defined by
\[
\frac{t}{e^t - 1} = \sum_{n=0}^{\infty} \frac{B_n t^n}{n!},
\]
which converges for $|t| < 2\pi$. Also, for each nonnegative integer $n$, the Bernoulli polynomial
$B_n(x)$ is defined by
\[
B_n(x) = \sum_{k=0}^{n} \binom{n}{k} B_{n-k} x^k.
\]
For integer $m \ge 1$ and arbitrary constants $\alpha$ and $\beta$, prove
\[
\sum_{k=0}^{m} \binom{m}{k} B_k(\alpha) B_{m-k}(\beta)
= -(m-1) B_m(\alpha + \beta) + m(\alpha + \beta - 1) B_{m-1}(\alpha + \beta).
\]

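Solution by David Borwein, University of Western Ontario, London, Ontario, Canada. Let
$L_m$ and $R_m$ denote respectively the left side and the right side of the desired equation. It is
well known and easily verified that
\[
\sum_{n=0}^{\infty} \frac{B_n(x) t^n}{n!} = \frac{t e^{xt}}{e^t - 1} \quad \text{for } |t| < 2\pi.
\]
For $|t| < 2\pi$, we thus have
\[
\sum_{m=0}^{\infty} \frac{L_m t^m}{m!}
= \sum_{k=0}^{\infty} \frac{B_k(\alpha) t^k}{k!} \sum_{l=0}^{\infty} \frac{B_l(\beta) t^l}{l!}
= \frac{t e^{\alpha t}}{e^t - 1} \cdot \frac{t e^{\beta t}}{e^t - 1}
= \frac{t^2 e^{(\alpha+\beta)t}}{(e^t - 1)^2},
\]
\[
\sum_{m=0}^{\infty} \frac{R_m t^m}{m!}
= -\sum_{m=0}^{\infty} \frac{(m-1) B_m(\alpha+\beta) t^m}{m!}
+ (\alpha + \beta - 1) \sum_{m=1}^{\infty} \frac{B_{m-1}(\alpha+\beta) t^m}{(m-1)!}
\]
\[
= -t^2 \frac{d}{dt}\left(\frac{e^{(\alpha+\beta)t}}{e^t - 1}\right)
+ (\alpha + \beta - 1) \frac{t^2 e^{(\alpha+\beta)t}}{e^t - 1}
\]
\[
= -(\alpha+\beta) \frac{t^2 e^{(\alpha+\beta)t}}{e^t - 1}
+ \frac{t^2 e^{(\alpha+\beta)t} e^t}{(e^t - 1)^2}
+ (\alpha + \beta - 1) \frac{t^2 e^{(\alpha+\beta)t}}{e^t - 1}
\]
\[
= \frac{t^2 e^{(\alpha+\beta)t} e^t}{(e^t - 1)^2} - \frac{t^2 e^{(\alpha+\beta)t}}{e^t - 1}
= \frac{t^2 e^{(\alpha+\beta)t}}{(e^t - 1)^2}.
\]

It follows that $L_m = R_m$ for $m \ge 1$.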
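
As an independent check of the convolution identity, the sketch below evaluates both sides exactly with rational arithmetic for a few values of $m$. It is an added illustration, not part of Borwein's argument; the helper names are hypothetical, and the Bernoulli numbers are generated from the standard recurrence $\sum_{k=0}^{n}\binom{n+1}{k}B_k = 0$ for $n \ge 1$, which matches the convention $B_1 = -1/2$ implied by $t/(e^t - 1)$.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, count):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B


def bernoulli_poly(n, x, B):
    """B_n(x) = sum_k C(n, k) B_{n-k} x^k."""
    return sum(comb(n, k) * B[n - k] * x ** k for k in range(n + 1))


def left_side(m, a, b, B):
    return sum(
        comb(m, k) * bernoulli_poly(k, a, B) * bernoulli_poly(m - k, b, B)
        for k in range(m + 1)
    )


def right_side(m, a, b, B):
    s = a + b
    return -(m - 1) * bernoulli_poly(m, s, B) + m * (s - 1) * bernoulli_poly(m - 1, s, B)


B = bernoulli_numbers(12)
alpha, beta = Fraction(1, 3), Fraction(-5, 7)  # arbitrary rational test values
for m in range(1, 11):
    print(m, left_side(m, alpha, beta, B) == right_side(m, alpha, beta, B))
```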


Editorial comment. Carl Axness pointed out that the identity appeared as series (50.11.2)
in Eldon R. Hansen, A Table of Series and Products, Prentice Hall, 1975. Bruce Dearden
and F. T. Howard independently generalized the identity to Bernoulli polynomials of higher
order. With $B_n^{(m)}(x)$ defined by $\sum_{n=0}^{\infty} B_n^{(m)}(x)t^n/n! = t^m e^{xt}/(e^t - 1)^m$, they proved that


k=0 
m (pt+q-1 a+ B +q-1 
p+q-1 p+q—1 
which reduces to the present identity when p = q = 1. 
Solved also by E. S. Andersen & M. E. Larsen (Denmark), J. Anglesio (France), T. M. Apostol, C. Axness, D. Bradley, 
J. C. Carey, R. J. Chapman (U.K.), B. Dearden, R. Ehrenborg (Canada), C. Georghiou (Greece), M. Hauss (Germany), 
F. T. Howard, J. H. Lindsey II, O. P. Lossers (Netherlands), I. Nemes (Austria), A. Pedersen (Denmark), R. Pirastu
(Austria), A. Stenger, A. N. ’t Woord (Netherlands), M. Vowe (Switzerland), D. Zeilberger, Anchorage Problem Solutions
Group, NSA Problems Group, WMC Problems Group, and the proposer. 


A Consequence of Fermat’s Little Theorem 


10417 [1994, 1013]. Proposed by Charles Vanden Eynden, Illinois State
University, Normal, IL. Characterize the positive integers $m$ such that
\[
m^n \equiv 1 \pmod{n} \implies m \equiv 1 \pmod{n}.
\]


Solution by Todd H. Trimble, Loyola University, Chicago, IL. The only such integers are
$m = 1, 2$. If $m > 2$, then the binomial expansion yields
\[
m^{(m-1)^2} = (1 + (m-1))^{(m-1)^2} \equiv 1 \pmod{(m-1)^2},
\]
but $m \not\equiv 1 \pmod{(m-1)^2}$.

Since $m = 1$ works trivially, it suffices to show that the hypothesis never holds when
$m = 2$. Suppose $2^n \equiv 1 \pmod{n}$ for some $n > 1$. Then $2^n \equiv 1 \pmod{p}$, where $p$ is the
least prime dividing $n$. Clearly $p \ne 2$, so $p$ is an odd prime. Since $n$ is not divisible by
any prime smaller than $p$, it is relatively prime to $p - 1$. By the Euclidean algorithm, there
exist integers $a, b$ such that $an + b(p - 1) = 1$. Using Fermat’s Little Theorem, we now
obtain the contradiction $2^1 = 2^{an + b(p-1)} \equiv 1 \pmod{p}$.
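
Both halves of the argument are easy to probe by machine. The sketch below is an added illustration with arbitrary search bounds and hypothetical function names: it checks that $n = (m-1)^2$ is a counterexample modulus for each $m > 2$ in a range, and that no $n$ with $1 < n \le 20000$ satisfies $2^n \equiv 1 \pmod{n}$.

```python
def breaks_implication(m):
    """True if n = (m - 1)^2 satisfies m^n = 1 (mod n) while m != 1 (mod n)."""
    n = (m - 1) ** 2
    return pow(m, n, n) == 1 and m % n != 1


def moduli_with_power_of_two_one(limit):
    """All n with 1 < n <= limit and 2^n = 1 (mod n); the proof shows there are none."""
    return [n for n in range(2, limit + 1) if pow(2, n, n) == 1]


print(all(breaks_implication(m) for m in range(3, 200)))
print(moduli_with_power_of_two_one(20000))
```

The three-argument form of `pow` performs modular exponentiation, so the large exponents cost nothing.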


Editorial comment. Gerry Myerson observed that the case m = 2 is Problem A-5 on the 
1972 Putnam exam (this MONTHLY 80 (1973), 1017-1028) and showed that the implication 
fails also for negative m. 
Solved also by R. Barbara (Lebanon), J. T. Bruening, D. Callan, R. J. Chapman (U. K.), J. Christopher, G. Ehrlich, 
R. Holzsager, M. Hudelson, J. H. Lindsey II, O. P. Lossers (The Netherlands), G. Myerson (Australia), A. Pedersen 
(Denmark), M. Reid, A. A. Tarabay (Lebanon), D. B. Tyler, H. Widmer (Switzerland), A. N. ’t Woord (The Netherlands), 
NSA Problems Group, and the proposer. 


A Sequence of Polynomials with Positive Coefficients 


10420 [1994, 1014]. Proposed by C. R. Selvaraj and S. Selvaraj, Pennsylvania State University,
Shenango Campus, Sharon, PA. Let
\[
g_i(n) = \sum_{k=i}^{\infty} \frac{k - i + 1}{k!} \left( (n+2)^k - 2e(n+1)^k + e^2 n^k \right).
\]
Prove that, for all $i > 1$, $g_i(n)$ is a polynomial in $n$ of degree $i - 2$ and $g_i(n) > 0$ for all
$n \in \mathbb{N}$ and $i \in \mathbb{N}$.

Composite solution by many solvers. We prove something stronger: For $i \ge 2$, $g_i$ is a
polynomial of degree $i - 2$ with all coefficients positive. Since
$\sum_{k=0}^{\infty} (k - i + 1)x^k/k! = (x - i + 1)e^x$, extending the sum given for $g_i(n)$ over all
nonnegative $k$ yields
\[
(n - i + 3)e^{n+2} - 2e(n - i + 2)e^{n+1} + e^2(n - i + 1)e^{n},
\]
which is 0 for all $n$. The term with $k = i - 1$ is also 0, so
\[
g_i(n) = \sum_{k=0}^{i-2} \frac{i - 1 - k}{k!} \left( (n+2)^k - 2e(n+1)^k + e^2 n^k \right).
\]
Thus $g_i$ is a polynomial of degree $i - 2$ with leading coefficient $(1 - 2e + e^2)/(i-2)! =
(e - 1)^2/(i-2)!$.

We complete the proof using induction. We have shown that $g_2(x)$ is the constant
$(e - 1)^2$. Suppose $i > 2$ and the coefficients of $g_{i-1}$ are positive. The constant term of $g_i$ is
$g_i(0) = \sum_{k=i}^{\infty} (k - i + 1)(2^k - 2e)/k!$, which is a sum of positive numbers. Differentiating
the polynomial $g_i$ termwise yields $g_i' = g_{i-1}$. By the induction hypothesis, we conclude
that the rest of the coefficients are also positive.
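
The finite-sum form makes the claim easy to test in floating point. The sketch below is an added illustration with hypothetical helper names: it expands the finite sum into coefficients of powers of $n$, checks that they are positive for small $i$, and compares the polynomial with a truncation of the defining infinite series. Because the constant terms shrink factorially, the check is limited to small $i$, where rounding error is negligible.

```python
from math import comb, e, factorial, isclose


def coefficients(i):
    """Coefficients [c_0, ..., c_{i-2}] of g_i(n) = sum_r c_r n^r, from the finite sum."""
    coeffs = []
    for r in range(i - 1):
        c = 0.0
        for k in range(r, i - 1):
            weight = (i - 1 - k) / factorial(k)
            # Coefficient of n^r in (n+2)^k - 2e(n+1)^k + e^2 n^k.
            inner = comb(k, r) * (2 ** (k - r) - 2 * e)
            if k == r:
                inner += e * e
            c += weight * inner
        coeffs.append(c)
    return coeffs


def g_polynomial(i, n):
    return sum(c * n ** r for r, c in enumerate(coefficients(i)))


def g_series(i, n, terms=60):
    """Truncation of the defining series sum_{k >= i} (k-i+1)/k! * (...)."""
    total = 0.0
    for k in range(i, i + terms):
        bracket = (n + 2) ** k - 2 * e * (n + 1) ** k + e * e * n ** k
        total += (k - i + 1) / factorial(k) * bracket
    return total


for i in range(2, 9):
    print(i, coefficients(i))
```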


Editorial comment. Several solvers established the generating function identity
\[
\sum_{i=2}^{\infty} g_i(n) t^i = t^2 \left( \sum_{j=1}^{\infty} \frac{1 + t + \cdots + t^{j-1}}{j!} \right)^{2} e^{nt},
\]
from which it follows readily that $g_i$ is a polynomial of degree $i - 2$ with positive coefficients.


Solved by J. Anglesio (France), D. W. Bailey, D. A. Beckwith, D. M. Bradley, R. J. Chapman (England), C. Georghiou 
(Greece), J. H. Lindsey II, O. P. Lossers (The Netherlands), D. K. Nester, I. A. Sakmar (Turkey), J. H. Steelman, 
A. Stenger, D. B. Tyler, the NSA Problems Group, the WMC Problems Group, and the proposers. 


A Surprisingly Simple Summation Solution 


10424 [1995, 70]. Proposed by Ira Gessel, Brandeis University, Waltham, MA. Evaluate
the sum
\[
\sum_{0 \le k \le n/3} 2^k \frac{n}{n-k} \binom{n-k}{2k}.
\]

Solution I by E. Sparre Andersen and Mogens Esrom Larsen, University of Copenhagen,
Copenhagen, Denmark. The desired sum $s(n)$ is $2^{n-1} + \cos(n\pi/2)$. This follows from
\[
s(n) - 2s(n-1) + s(n-2) - 2s(n-3) = 0, \tag{1}
\]
which has the general solution $s(n) = A2^n + Bi^n + C(-i)^n$. Since $s(1) = s(2) = 1$ and
$s(3) = 4$, we obtain $A = B = C = 1/2$.

To establish (1), we express the sum as $s(n) = u(n) + v(n)$, where
\[
u(n) = \sum_{0 \le k \le n/3} 2^k \binom{n-k}{2k}, \qquad
v(n) = \sum_{1 \le k \le n/3} 2^{k-1} \binom{n-k-1}{2k-1}.
\]
Applying the binomial recurrence yields the recurrences
\[
u(n) = u(n-1) + 2v(n) \quad \text{and} \quad v(n) = v(n-1) + u(n-3).
\]
Repeated application of these relations shows that $u$ and $v$ satisfy (1), and thus their sum
also satisfies (1).
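
The closed form and the recurrence (1) can both be confirmed by exact computation for small $n$. This sketch is an added check, not part of the solution; the function names are illustrative, and $\cos(n\pi/2)$ is written out as the period-4 pattern $1, 0, -1, 0$ so that everything stays in integers.

```python
from fractions import Fraction
from math import comb


def s(n):
    """The sum over 0 <= k <= n/3 of 2^k * n/(n-k) * C(n-k, 2k), for n >= 1."""
    total = Fraction(0)
    for k in range(n // 3 + 1):
        total += Fraction(2 ** k * n, n - k) * comb(n - k, 2 * k)
    return total


def closed_form(n):
    """2^(n-1) + cos(n*pi/2), with the cosine taken from its period-4 pattern."""
    cosine = (1, 0, -1, 0)[n % 4]
    return 2 ** (n - 1) + cosine


for n in range(1, 13):
    print(n, s(n), closed_form(n))
```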


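Solution II by Thomas Honold, Technische Universität München, München, Germany. The
answer is $2^{n-1}$ when $n$ is odd and $2^{n-1} + (-1)^{n/2}$ when $n$ is even. Using the notation of
Solution I, we form the generating function $f(x) = 1 + \sum_{n \ge 1} s(n)x^n$. Letting $n = 3k + m$
and using $(1 - x)^{-(s+1)} = \sum_{m \ge 0} \binom{m+s}{s} x^m$, we obtain
\[
1 + \sum_{n \ge 1} u(n) x^n = \sum_{k \ge 0} 2^k \sum_{n \ge 3k} \binom{n-k}{2k} x^n
= \sum_{k \ge 0} \frac{2^k x^{3k}}{(1-x)^{2k+1}},
\]
\[
\sum_{n \ge 1} v(n) x^n = \sum_{k \ge 1} 2^{k-1} \sum_{n \ge 3k} \binom{n-k-1}{2k-1} x^n
= \sum_{k \ge 1} \frac{2^{k-1} x^{3k}}{(1-x)^{2k}}.
\]
Thus $f$ is the sum of two geometric series. We obtain
\[
f(x) = \frac{1}{1-x} \cdot \frac{(1-x)^2}{(1-x)^2 - 2x^3} + \frac{1}{2} \cdot \frac{2x^3}{(1-x)^2 - 2x^3}
\]
\[
= \frac{1 - x + x^3}{(1 - 2x)(1 + x^2)} = \frac{1}{1 + x^2} + \frac{x}{1 - 2x}
= \sum_{n \ge 0} (-1)^n x^{2n} + \sum_{n \ge 0} 2^n x^{n+1}.
\]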

n>0 


Solution III by Donald E. Knuth, Stanford University, Stanford, CA. Let
$t(n,k) = 2^k \binom{n-k}{2k}\, n/(n-k)$. The algorithm of Gosper and Zeilberger (explained in
Concrete Mathematics, 2nd edition, by R. L. Graham, D. E. Knuth, and O. Patashnik (Addison-Wesley,
1994)) quickly yields
\[
2t(n,k) - t(n+1,k) + 2t(n+2,k) - t(n+3,k) = 2t(n,k) - 2t(n,k-1).
\]
Summing over $0 \le k < n$ yields $2s(n) - s(n+1) + 2s(n+2) - s(n+3) = 0$ when
$n > 1$, since the right-hand side telescopes to $2t(n, n-1) - 2t(n, -1) = 0 - 0$ and since
$t(n,k) = 0$ for $n/3 < k < n$. This is the recurrence (1), and the solution follows.


Editorial comment. Michael Hoffman considered a more general summation. When $n$,
$q$, and $p + q$ are positive integers, let
$F_{n,p,q}(t) = \sum_{0 \le k \le n/(p+q)} t^k \frac{n}{n-pk} \binom{n-pk}{qk}$. With
$F_{0,p,q}(t) = 1$, Hoffman proved that
\[
\sum_{n=0}^{\infty} F_{n,p,q}(t)\, x^n = \frac{(1-x)^{q-1} + (p/q)\, t\, x^{p+q}}{(1-x)^{q} - t\, x^{p+q}}.
\]
He thus obtained $F_{n,-1,2}(-2) = \cos(n\pi/2)$. John Henry Steelman obtained $F_{n,1,1}(1) =
L_n$, where the Lucas numbers $\{L_n\}$ are defined by the Fibonacci recurrence with $L_1 = 1$
and $L_2 = 3$.
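
The two special cases quoted above can be checked directly. The sketch below is an added illustration; it assumes the summand $t^k \frac{n}{n-pk}\binom{n-pk}{qk}$ for Hoffman's sum (the form that reduces to Gessel's sum at $p = 1$, $q = 2$, $t = 2$), and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def F(n, p, q, t):
    """Assumed form of Hoffman's sum: sum over 0 <= k <= n/(p+q) of t^k * n/(n-pk) * C(n-pk, qk)."""
    total = Fraction(0)
    for k in range(n // (p + q) + 1):
        total += Fraction(n, n - p * k) * comb(n - p * k, q * k) * t ** k
    return total


def lucas(n):
    """Lucas numbers with L_1 = 1, L_2 = 3 and the Fibonacci recurrence."""
    a, b = 1, 3
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def cos_quarter_turns(n):
    """cos(n*pi/2) as an exact integer."""
    return (1, 0, -1, 0)[n % 4]


for n in range(1, 11):
    print(n, F(n, 1, 1, 1), lucas(n), F(n, -1, 2, -2), cos_quarter_turns(n))
```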

Solved also by J. Anglesio (France), J. C. Binz (Switzerland), J. T. Bruening, D. Callan, E. R. Canfield, R. J. Chapman 
(U. K.), D. A. Darling, S. B. Ekhad, M. Hoffman, L. N. Howard, W. P. Johnson, D. Krug, J. H. Lindsey II, O. P. Lossers 
(The Netherlands), K. McInturff, I. Nemes & P. Paule (Austria), C. R. Pranesachar (India), R. Richberg (Germany), 
E. Schmeichel, J. H. Steelman, A. N. ’t Woord (The Netherlands), M. Vowe (Switzerland), NSA Problems Group, and 
the proposer. 
REVIEWS 


Edited by Underwood Dudley 
Mathematics Department, DePauw University, Greencastle, IN 46135 


Mathematical Circles (Russian Experience) by Dmitri Fomin, Sergey Genkin, and Ilia
Itenberg. Translated from Russian by Mark Saul. American Mathematical Society,
1996, 272 pp, $29.00


Reviewed by Andre Toom 


I shall call this book Circles for short. Circles is written by three Russian
mathematicians with whom I have much in common. For many years we participated
in similar activities, including organizing mathematical competitions and
teaching informal classes called “circles” in Russia. Six years ago I moved to the
United States and worked at several universities. Based on this experience I want
to reflect on how Circles can be used in America.

First of all, it is a rich collection of good problems. In addition there are useful 
notes for teachers. I think that Appendix A, which describes several types of 
mathematical contests, will be especially interesting for those who are dealing with 
all forms of cooperative learning. To get some taste of the book, let us look at a 
few problems. 


Problem 1 on page 1 

A number of bacteria are placed in a glass. One second later each bacterium 
divides in two, the next second each of the resulting bacteria divides in two again, 
et cetera. After one minute the glass is full. When was the glass half-full? 


Some students propose a half-minute as the answer, implicitly assuming that the
growth is linear. This problem shows in a dramatic way how different exponential
growth is from linear growth.


Problem 10 on page 172 

An evil king wrote three secret two-digit numbers a, b, and c. A handsome prince 
must name three numbers X, Y, and Z, after which the king will tell him the sum 
aX + bY + cZ. The prince must then name all three of the King’s numbers. 
Otherwise he will be executed. Help him out of this dangerous situation. 


It may seem at first that it is impossible to determine three variables from one
equation and that the prince is doomed, but in fact he can save himself. The
solution is based on a fruitful idea that can be used to introduce students to
the basics of information theory.

Circles may be very useful wherever there are classes devoted to solving 
non-standard problems. There may be more such classes than we are aware of at 
some selective schools. But Russian circles were not selective in any formal sense; 
in fact, anybody might drop in. I did so about forty years ago: I simply took a
trolleybus, went to the old university campus in downtown Moscow and started to
attend informal classes taught by students of Moscow University. I submitted no
formal application, paid nothing, and got no grades, but there I became a 
professional mathematician. Most classes discussed olympiad-style problems, but 
our teacher, Sasha Olevskii, was keen on epsilon-delta arguments of mathematical 
analysis, and for me it was just the right thing. Later I taught circles for decades, 
and now I meet my former students at various places, including Boston and Tel 
Aviv. 

However, I think that the possible use of Circles is wider. I cannot imagine 
mathematical education without teaching students to solve mathematical problems. 
I am astonished when educators discuss some special “problem-solving approach” 
to teaching mathematics as if the opposite approach ever made sense. Universities 
offer special “problem-solving” courses. Does this mean that other courses do not
teach students to solve problems? Anyway, “problem-solving” courses should be welcomed,
and Circles deserves a very prominent place among candidates for textbooks for 
such courses. In addition, many universities offer courses in mathematics without 
specifying exactly what these courses are about. When I teach such courses, I try to 
teach my students to solve as many problems as possible. It was very difficult for 
me to find a textbook for that. Now I have found at least one: Circles. 

At this point I must admit some inconsistency. I have to agree with the 
translator’s disclaimer: “This is not a textbook.” At the same time I am going to use 
Circles as a textbook. The point is that it is an unusual textbook. It is unusual in 
several respects, but let us concentrate on the following. First, most problems in 
Circles need a rigorous approach. Some of them explicitly require proofs, while 
others ask questions whose answers involve argumentation. Second, most problems 
in most textbooks can be solved in a way that is explained in advance. Many such 
problems have identical mathematical structure presented in different “real world” 
guises. All this is absent in Circles. Most problems need a new idea although 
problems are grouped and some general theory is explained. 

Now we approach a very important notion: transfer of training. Let us imagine 
the set of all possible problems in a branch of mathematics as a metric space. 
Every particular problem is a point in this space and similar problems are close to 
each other. By discussing a problem with our students, we cover a sphere with the 
center at this problem and radius equal to our students’ ability to transfer their 
training from this problem to similar problems. Our purpose is to cover the 
greatest possible space with these spheres. Is transfer of training possible? For the 
authors of Circles, for me, and for all who have ever taught in this style the answer 
is evident: “yes, of course, transfer of training is possible and it is closely related to 
another precious human ability: generalization.” 

But some American educators have questioned the importance or the very 
possibility of transfer of training. Instead they proposed to solve only such 
problems in class that people face in everyday life. These educators have sent 
many messages to the effect that students are not interested in problems that have 
no immediate real-world use. As a result, some students’ ability for generalization 
and transfer of training is almost completely atrophied. As soon as a problem on a 
test slightly deviates from problems solved in advance, they complain, “We did not 
solve such problems.” Whichever course these students take, they learn to solve 
only those particular problems that the teacher chose to explain. Their spheres 
have radii that are close to zero and their total measure is also close to zero. It is a 
safe bet that the problems they have to solve after graduation practically never 
coincide with those few they have learned to solve in class. 
All this does not mean that the manner in which problems are formulated is 
unimportant. It is very important and we may reproach Circles for some neglect in 
this respect. Let us illustrate these ideas by some examples. 


Problem 11 on page 53 

A woodman’s hut is in the interior of a peninsula which has the form of an acute 
angle. The woodman must leave his hut, walk to one shore of the peninsula, then 
to the other shore, then return home. How should he choose the shortest such 
path? 


This is a very useful problem. It introduces students to the important realm of 
geometrical transformations. It also gives the teacher a chance to speak about 
relations between mathematics and physics because light also “chooses” the 
shortest path. Regrettably, no motivation is given why the woodman should walk in 
such a strange way. In Steinhaus’s book a similar problem is formulated more 
vividly: “An Arab wishes to return to his tent, but on the way he wants to feed his 
horse and draw water from a river.” [2, pp. 111-112]. Note that my criticism has 
nothing to do with the silly requirements of straightforward “real-world” relevance. 
The real relevance is through theory, as usual. Another example: in problem 22 on 
page 3 there is a strange river that makes a 90° turn. In Kordemsky’s book [1, 
problem 126 on p. 54] it was a moat surrounding a fort, which was more plausible 
and romantic. 

Since I started to criticize Circles, let me continue to do it from my point of view 
as a teacher. Circles does not mention irrational numbers at all, not even the 
famous proof by contradiction that no square of a rational number equals 2. It is 
very easy to collect a series of problems in this vein. (This is what I do in my 
classes.) Why didn’t the authors of Circles do it? 

Some other important facts are present but not emphasized. For example, it 
takes attentive reading to find problem 53 on page 29, which is included in a
section called “A few more problems” as if it were something optional: Prove that
there are infinitely many prime numbers. Nothing is said about the importance of this 
fundamental fact. 

On page 177 the authors explain one method to invent problems, which they
call “inequalities à la Leningrad.” A typical example, problem 1 on page 175:
Which number is greater: $31^{11}$ or $17^{14}$? Problems of this sort seem to have been a
useful contribution to olympiads in the past, but I am afraid that the presence of
calculators kills most of them. On the other hand, look at problem 43 on page 161:
If all the sides of a triangle are longer than 1000 inches, can its area be less than 1
square inch? This problem and several similar ones were invented in the Moscow-based
School by Correspondence, and these problems remain usable in the presence
of any technology.

Problem 10 on page 23 assumes acquaintance with divisibility tests for 3 and 9, 
which are introduced only in problem 31 on page 99. The same seems to be true 
for problem 76 on page 72; I could solve it only using divisibility tests. There is 
some confusion with problem 65 on page 71. The letter M is misplaced and K is 
absent in figure 123 on page 163. 

I admit another inconsistency about students’ age. Circles is addressed to 12-14
year old children. This contrasts with the way I am going to use Circles. All of my
students are intelligent and motivated, but they are around twenty years old or
more. Sometimes I wonder what they have been doing for so many years. Now they
have to think about graduation and making a career. They are pressed by time and
money concerns, some have to do odd jobs, some have children. Of course, all this
interferes with their study. It would be much better for them to have started to
solve non-trivial problems years ago.

This inconsistency is especially noticeable when one reads the initial Chapter
Zero, intended for students of ages 10-11. “The problems of this chapter have
virtually no mathematical content,” the authors naively claim. Actually the problems
of this chapter require the most fundamental ability: the ability for abstract
thinking. This ability should by no means be taken for granted; it develops as a
result of careful and well-thought-out schooling. Let us remember that most teachers
of circles in Russia were not professional teachers. In most cases they were
university students, inspired but inexperienced. Their teaching was successful due
to the sound preparation provided by the national educational system. In my young
years the Russian educational administration seemed very stupid to me, but now I
see how efficient it actually was. Don’t ask me how this squares with the
tyrannical Soviet rule and the ailing Russian economy because I don’t have all the
answers.

Some people ask whether there are competent enthusiasts in America who 
could and would teach classes of creative problem solving. The answer is “yes, of 
course,” but this is not the right question to ask. The right question is whether the 
educational system can teach the basics of mathematics so that children will be 
able to attend such classes. 


REFERENCES 


1. Boris A. Kordemsky, The Moscow Puzzles, 359 Mathematical Recreations. Dover Publications, 1992. 
2. H. Steinhaus, Mathematical Snapshots. New York, Oxford University Press, 1969. 


Department of Mathematics 
University of the Incarnate Word 
4301 Broadway 

San Antonio, TX 78209 

toom@the-college.iwctx.edu 


Vita Mathematica: Historical Research and Integration with Teaching. Edited by Ronald 
Reviewed by Hardy Grant 


That mathematics has an “image problem” with large segments of the general 
public has been obvious for a long time. The stock indictment is all too familiar, 
for it has echoed repeatedly in the cultural creations that reflect and shape our 
collective values. Since Aristophanes, mathematics and its devotees have been 
lampooned, gently or savagely, as narrow, austere, mechanical, passionless, exces- 
sively cerebral, cold. Examples abound. Jane Austen’s Emma (1816) considers at 
one point that the romantic events unfolding around her must impinge on “the 
coldest heart and the steadiest brain,” on a linguist, a grammarian, “even [!] a 
mathematician.” The “Master Mathematician” in Oscar Wilde’s The Happy Prince 
(1888) “frowned and looked very severe, for he did not approve of children 
dreaming.” Have perceptions changed in our own time? A few months ago, the 
parents in the enjoyable comic strip Sally Forth, weighing possible summer destina- 
tions for their 9-year-old, threatened her with “Algebra Camp” (“just outside Geek 
City”) as the absolute antithesis of a place where a kid could hope to have some 
fun. Readers of the MONTHLY could multiply these instances at will. The problem 
runs very deep, and does not look like going away any time soon. 

Is there in fact a cure? Countless adults blame their distaste for mathematics on 
bad teaching, so the schools would seem the natural place to start. But what to do 
there? For many the best hope remains the use of history in the mathematics 
classroom as a “humanizing” corrective. The Danish educator Torkil Heiede goes 
so far as to say, in his contribution to the volume under review, that to teach 
mathematics “without its history” is to teach it “as if it were dead.” The book’s 
origin reflects the now widespread sharing and institutionalization of such senti- 
ments. Many of the papers assembled here began as talks given in the summer of 
1992 at either the Quadrennial Meeting of the International Study Group on the 
Relations between History and Pedagogy of Mathematics (“HPM”), at Toronto, or 
the Seventh International Congress on Mathematical Education (“ICME”), at 
Quebec City. According to Ronald Calinger, the volume’s editor, other papers are 
the fruits of a call sent out by him to “historians of mathematics.” The book’s 
subtitle, “Historical Research and Integration with Teaching,” captures well enough 
such focus and unity as this wildly diverse collection may be said to possess. 
Calinger identifies the primary audience as “mathematics teachers, research 
mathematicians, historians of mathematics, and historians of science” (p. vii). 

Predictably, these thirty articles vary enormously in subject matter, in level of 
presentation, and in putative audience. Someone, presumably the editor, has 
gamely tried to partition them according to theme, but the subsets’ boundaries are 
rather slippery, and indeed I would argue that perhaps half a dozen contributions 
have ended up in the wrong boxes. I shall adopt a somewhat different grouping in 
what follows. I should add that any attempt to summarize so many papers in so 
brief a compass must be lamentably selective and superficial, and due apology is 
hereby extended both to their authors and to potential readers. 

A preliminary word about the book’s “production values” may be in order. The 
index is good, and in many articles the bibliography alone is worth the price of 
admission. Many marvellous illustrations have been culled from inaccessible 
sources. The writing is mostly very competent, if occasionally clumsy and seldom 
inspired. Surprisingly or not, some of the authors for whom English is presumably 
a learned language write it better than do some presumably native users. Alas, 
these pages will not reassure those readers (I am one) who fear that in our time 
professional standards of editing and proofreading are in steep decline. Thus (to 
offer a sampling that is very far from exhaustive) “data” is stubbornly treated as 
singular (pp. 92, 335), and “phenomena” too for good measure (p. 332); two verbs 
on p. 94 disagree with their respective subjects; several references of a paper to 
itself have been left as “p. ???”; and computer gremlins have at several places been 
allowed to substitute commas for occurrences of “é.” (One might have thought that 
“Dieudonn,” (pp. 9, 10) would rouse the sleepiest proofreader; but apparently not.) 
It is only fair to add that the passages involving technical mathematics seem to 
have been treated with commendable care. 

Not much in these pages will jolt readers out of their seats, or spark revolutions. 
Few seeds of potential controversy are here sown. But one paper, David Rowe’s 
fine survey of trends in the historiography of mathematics, deals at some length 
with old disagreements that still smolder; I shall come back to that paper, and 
those issues, toward the end of the review. The author here whom I should most 
enjoy debating face to face is Ubiratan D’Ambrosio, who offers a spirited defence 
of an “ethnomathematics” conceived so broadly as to coincide, so far as I can see, 
with all of cultural anthropology (“the study of techniques developed in different 
cultures for explaining, understanding, and coping with their physical and socio- 
cultural environments”). Regrettably, a definition so vague and idiosyncratic seems 
likely to take the paper out of the mainstream discussion of its ostensible subject. 

One of the book’s great delights is its several ventures along relatively untrod- 
den byways of our mathematical heritage. Complete with charming cartoons (by 
the author?), Beatrice Lumpkin’s article cleverly interweaves the history of the rule 
of false position (which she traces to ancient Egypt) and the career of the 
outstanding mathematics educator Benjamin Banneker (d. 1806). Peggy Kidwell 
tells the absorbing story of the various objects, especially geometric models, that 
saw service as teaching aids in 19th-century schools and universities. Kidwell links 
these long-ago educators’ enthusiasm for such objects to their belief that “it was 
more useful to show pupils the properties of surfaces and solids using models than 
to offer formal proofs of those properties”—a notion that has not lost its validity. 
Karen Dee Michalowicz rescues from obscurity Mary Everest Boole (the mountain 
is named for her uncle), wife of George Boole and a distinguished writer on, and 
teacher of, mathematics in her own right. Those who still doubt that women have 
“come a long way” in our subject may like to ponder a despairing remark made by 
Mary’s father in 1842, when she was ten. If she could go to university, he said 
(which she could not), she “would carry everything before her ... But what could 
a girl do learning mathematics?” This article includes two-plus pages of short 
quotations from Mary Boole’s writings on the art of teaching, a little treasury that I 
wouldn’t trade for ten years’ worth of “aims and objectives” from my local Ministry 
of Education. 
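
For readers unfamiliar with the rule of false position mentioned above in connection with Lumpkin's article, the following is a minimal sketch of its simplest ("single") form, which works when the unknown enters proportionally. It is a modern restatement, not Lumpkin's own presentation; the function name `false_position` and its parameters are assumptions made for illustration. The worked case is Problem 24 of the Rhind Papyrus: a quantity and its seventh together make 19.

```python
from fractions import Fraction


def false_position(f, target, guess):
    """Solve f(x) = target for a proportional f by rescaling one guess.

    Hypothetical helper for illustration: assumes f(k * x) == k * f(x).
    """
    trial = f(guess)                # what the convenient guess actually yields
    return guess * target / trial   # scale the guess in the same proportion


# Rhind Papyrus, Problem 24: a quantity and its seventh together make 19.
# The convenient guess is 7, since 7 + 7/7 = 8 avoids fractions.
quantity = false_position(lambda x: x + x / 7, Fraction(19), Fraction(7))
print(quantity)         # 133/8
print(float(quantity))  # 16.625
```

Exact fractions are used so that the answer matches the scribe's 16 + 1/2 + 1/8 without rounding.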

Vita Mathematica contains a number of papers on “straight” history of 
mathematics, without overt or obvious pedagogical application. Jens Høyrup traces from 
old Babylonia to the Renaissance the career of a single problem—to find the side 
of a square from the sum of its perimeter and area; this, he says, turns out to 
belong to a non-scholarly tradition of practical geometry, whose interaction with 
the contemporary “literate” mathematical culture he discusses at length. Wilbur 
Knorr’s theme is the ancient practice of the “method of indivisibles” and its 
influence in the age of Cavalieri. Knorr argues that an “indivisibilist heuristic” 
predated Archimedes, and that early-modern versions represent “a reconstruction 
of the lost heuristic by geometers who were impatient over the demands of formal 
demonstration.” These two papers are substantial monographs, with the detail and 
rigor of argument and documentation that we expect from their authors. Shorter 
and slighter, but perhaps on that account better suited to “integration with 
teaching,” are an overview of traditional Chinese mathematics by Frank Swetz, a 
discussion of combinatorics and induction in medieval Hebrew and Islamic mathe- 
matics by Victor Katz, and Barnabas Hughes’ description of the earliest correct 
algebraic solution of cubic equations, by Master Dardi of Pisa (c. 1350). All three 
of these essays are solid and useful. Žarko Dadić well summarizes the work of the 
Croatian scientist Marin Getaldić (1568-1626). Getaldić wrote inter alia a treatise 
De resolutione et compositione mathematica—which is to say, the use in mathematics 
of the method of “analysis and synthesis.” Some day that volume, and Dadić’s 
paper here, will be sought out by the author of one of the great unwritten books of 
the western intellectual tradition, a history of analysis and synthesis in their two 
millennia of life, not just in mathematics but in philosophy and early-modern 
science as well. 
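
In modern symbols (which, as the later part of this review cautions, are not those of the original practitioners), the problem Høyrup follows can be sketched as below. The letters s for the side and c for the given sum of perimeter and area are assumptions introduced here, and the numerical check is an illustrative instance, not one taken from Høyrup's paper.

```latex
\[
  s^2 + 4s = c
  \;\Longrightarrow\;
  (s+2)^2 = c + 4
  \;\Longrightarrow\;
  s = \sqrt{c+4} - 2 .
\]
% Illustrative numbers only: c = 140 gives s = \sqrt{144} - 2 = 10,
% and indeed 10^2 + 4 \cdot 10 = 140.
```
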

The other purely historical essays in Vita Mathematica take the reader much 
nearer to our own time. Roger Cooke gives an admirable account of Sof’ya 
Kovalevskaya’s “mathematical legacy,” in particular her “discovery of a physical 
configuration for which the equations of motion of a rigid body about a fixed point 
under the influence of gravity can be integrated in closed analytic form.” William 
Aspray, Andrew Goldstein, and Bernard Williams entertainingly recount the rise 
of theoretical computer science and engineering, in both their intellectual and 
social aspects but with emphasis on the latter, specifically the supporting role of 
the National Science Foundation. 

Other papers treat of the history of mathematics education. Hans Niels Jahnke, 
in one of the book’s best essays, relates the complicated 19th-century history of 
“algebraic analysis”—the most familiar manifestation is probably Lagrange’s 
recasting of the calculus in terms of power series—and explains how its vogue had 
the remarkable effect of excluding infinitesimal methods from the curricula of 
German gymnasia for several decades. Jahnke argues that pedagogically this was 
no bad thing, for it entailed a stress on “concrete” problem-solving as opposed to 
blind application of algorithms. Susann Hensel’s topic is the mathematical educa- 
tion of engineers in late-19th-century Germany. As she says, some of the questions 
then under debate are hardy perennials: how “pure” and rigorous should the 
mathematics in “service” courses be, and should they be taught only by mathemati- 
cians? Ronald Calinger focusses on the early years (1861 ff.) of the famous 
mathematics seminar at Berlin; the midwives at its birth were Kummer and 
Weierstrass. Calinger weaves into his tale a good deal of the mathematics of these 
two giants and of their contemporaries. 

The papers in yet another group get closer to the volume’s professed goal of 
integrating history with teaching. Fred Rickey makes the case in general terms, 
describes particular techniques in his own practice, and calls attention to such 
resources as the HPM newsletter and the MAA’s thriving e-mail discussion group 
on the history of mathematics. (Rickey is too modest to mention here that the 
latter is his own creation.) Torkil Heiede relays the welcome news that the Danish 
government has mandated the use of the history of mathematics in “upper grades.” 
One longs to know more: who had the ears of the bureaucrats, and what 
arguments were used that might be transplanted to less progressive jurisdictions? 

A repeated theme in these pages is the value of history in promoting a view of 
mathematics as a process, rather than merely a product, of human striving and 
discovery. This contrast explicitly guides Evelyne Barbin’s persuasive advocacy, with 
historical examples, of a problems-oriented approach to teaching. The benefits, 
she says, include a better understanding and tolerance of pupils’ errors. Similarly 
Manfred Kronfellner, after providing a short history of the function concept, draws 
pedagogical lessons that include the realization that student errors may actually 
mimic those of the great masters. Indeed, Kronfellner says, in teaching any set of 
ideas one should follow as much as possible their historical evolution, for one can 
“assume” that aspects elucidated earlier in history should be easier for pupils to 
grasp. Some empirical evidence in the same direction is offered by Peter Bero, who 
conducted in Slovakia a survey of young people’s perceptions of the continuum. 
(The nodding proofreaders here provide the funniest of the book’s innumerable 
typos: the age range of Bero’s respondents is said (p. 304) to be “1 to 18.” So what 
does the playpen set make of Zeno?) Bero reports that these elementary-school 
and gymnasium students conceive the continuum in ways “often similar to those 
displayed by ancient Greek mathematicians.” One thinks of the biologists’ catchy 
old saying that “ontogeny recapitulates phylogeny”—the individual’s development 
retraces the evolution of her species. 

Other authors seek to base teaching on the direct use or creative reworking of 
primary sources. Richard Laubenbacher and David Pengelley describe an upper- 
level honors course that presents students with “mathematical masterpieces,” 
ranging in time from Archimedes to John Conway, and asks them to function as 
“critics” in the sense in which this term is used in the arts. Israel Kleiner outlines a 
course, for teachers, built around the use of quotations about mathematics, 
supplemented by a “very concise” chronology. Very instructive here are a number 
of pairs of mutually opposing quotations, which give a vivid sense of the complexity 
of issues and of their inherent drama. This potential for drama is taken to its 
natural conclusion by Gavin Hitchcock, who has written stageable dialogue in 
which figures from the history of mathematics debate their respective stances. In 
the first of these playlets Simon Stevin touts irrational numbers against the qualms 
of Michael Stifel; in the second, set in 1827, George Peacock and Augustus de 
Morgan try to make William Frend (himself a mathematician) accept multiple and 
negative roots of equations. Given the right actors and audience, these exchanges 
could “come off the page” with real theatrical effectiveness. 

Several authors present case studies that extract pedagogical morals from 
specific historical episodes. John Fauvel’s point of departure is a wonderfully 
curious “tree” diagram, from a book of 1808, which shows graphically the various 
factors that led to the abolition of the slave trade. Fauvel’s agenda here is 
threefold. He argues that (i) “graphical representation and modelling” have been 
neglected by historians of mathematics, (ii) they have also been “devalued” by 
teachers, in comparison with “prose text or algebraic symbolism,” and (iii) they can 
and should be used to “empower” students, that is, to encourage in students the 
“knowledge and belief that they can use and create mathematics to influence their 
way in the world.” Marie-Françoise Jozeau and Michèle Grégoire describe the 
meridian measurements in France (1792-99) that led to the first definition of the 
meter, and tell also (a bit cursorily, to my regret) of an imaginative reenactment, 
near Paris, of part of that labor by a group of modern high school students. Jim 
Tattersall sketches the history of attempts to estimate the total number of people 
who ever lived, and proposes a classroom exercise to the same end. This piece 
revives an ancient wheeze which is the best joke in a book not brimming with 
humor. A class of students was asked how they knew they were going to die, and 
one replied brightly that “it was because so far most people have.” 

Several other papers (in addition to Tattersall’s) draw their case studies from 
the history of the calculus. Judy Grabiner summarizes the sharply contrasting 
approaches taken to the calculus by Maclaurin (geometric) and by Lagrange 
(algebraic), respectively. That is a familiar story, but Grabiner goes on to trace the 
difference to differing cultural influences, and sets out the lesson for teachers: the 
tendency of students to adopt diverse approaches to mathematical problems is 
both natural and legitimate, especially where “non-traditional” backgrounds are 
involved. Martin Flashman suggests that instructors give an extra historical dimen- 
sion to the textbook account of the Fundamental Theorem of the Calculus by 
presenting in detail the wholly geometric version proved by Isaac Barrow just 
before the watershed work of Newton and Leibniz. Man-Keung Siu considers 
“integration in finite terms” from its first rigorous treatment by Liouville (1830s) to 
the modern classroom. He gives splendid expositions both of the mathematics itself 
—which is far from elementary—and of its history, and in both aspects he 
maintains an exemplary balance between superficiality on the one hand and 
excessive detail on the other. And Siu is equally good on the pedagogical issues. 
He has a fine sense of the kind of question that better students are likely to raise, 
and he specifically addresses the impact of computers on the teaching of his 
chosen topic. For my money his paper is another of the volume’s highlights. 
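
As a reminder of what "integration in finite terms" decides, here is the standard textbook instance, stated in modern form. Whether Siu's paper uses this particular example is not stated in the review, and the symbols f, g, and R are notation assumed here: f and g are rational functions with g nonconstant, and R ranges over rational functions.

```latex
\[
  \int f(x)\,e^{g(x)}\,dx \ \text{is elementary}
  \iff
  \exists\, R \in \mathbb{C}(x):\; R'(x) + g'(x)\,R(x) = f(x).
\]
% With f = 1 and g = -x^2 the condition reads R' - 2xR = 1,
% which no rational function R satisfies,
% so \int e^{-x^2}\,dx is not an elementary function.
```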

I come finally to the excellent essay, already cited above, in which David Rowe 
depicts “new trends and old images” in the historiography of mathematics. In 
particular he discusses the notorious debate that climaxed in the late 70s over the 
“geometrical algebra” commonly credited to the Greeks. Many results stated and 
proved in geometrical language by Euclid can easily be “translated” into elemen- 
tary algebraic identities; but does such restatement distort the Greeks’ own point 
of view? So argued the historian Sabetai Unguru, whose views then evoked 
rebuttals from mathematicians of the stature of André Weil, Hans Freudenthal, 
and B. L. van der Waerden. Rowe expounds this particular dispute with applaud- 
able fairness, and then sets it in a larger context, centering his discussion around 
some provocative views voiced by the same André Weil in a famous lecture in 
1978. Weil declared on that occasion that for mathematicians the “first use” of the 
subject’s history is “to put or keep before our eyes ‘illustrious examples’ of 
first-rate mathematical work.” Therefore “the craft of mathematical history can 
best be practiced by those of us who are or have been active mathematicians or at 
least who are in contact with active mathematicians.” Moreover, Weil urged, it is 
appropriate, indeed necessary, to interpret the mathematical ideas of past cultures 
in modern terms. For example, “it is impossible for us to analyze properly the 
contents of Books V and VII of Euclid [on ratios of magnitudes and on number 
theory, respectively] without the concept of group and even that of groups of 
operators, since the ratios of magnitudes are treated as a multiplicative group 
operating on the additive group of the magnitudes themselves.” 
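
A standard instance of the "translation" at issue is Euclid's Elements II.4, which states that if a straight line is cut at random, the square on the whole equals the squares on the segments together with twice the rectangle contained by the segments. Writing a and b for the lengths of the two segments (symbols that are of course not Euclid's), the algebraic reading is:

```latex
\[
  (a+b)^2 = a^2 + 2ab + b^2 .
\]
```

The dispute is not over whether this identity follows from the proposition, but over whether stating it this way represents what the Greek geometers themselves meant.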

The tendency exhibited in this last quotation is probably more seductive in 
mathematics than in any other subject. For in mathematics, as the same quotation 
shows, many ancient ideas can be fitted with perilous ease into modern abstract 
frameworks to which they are wholly foreign in conception and in spirit. Perhaps 
then it was no great surprise to read recently a hint that mathematics may have 
acquired among outsiders a certain reputation for this kind of thing. In the 
obituary of Joseph Needham that she wrote for the June 1996 issue of Isis, 
Francesca Bray tossed out the aside that “projecting modern meanings onto 
ancient terms [is] a practice that often passes unnoticed, I am told, in the history of 
Western mathematics.” “Often,” indeed; for the temptation is chronic. 

But not, in fact, “unnoticed.” It was just this tendency that Sabetai Unguru 
protested back in the 70s, and now David Rowe takes up the cause. Still 
addressing himself specifically to the views of André Weil, quoted above, Rowe 
explains with great civility what “disturbs” historians about viewing ancient ideas 
through modern lenses. He associates Weil’s vision of history with a Platonist 
philosophy, which makes mathematical truths independent of time and of histori- 
cal milieu. As he says, this orientation disposes one to “present the development 
of mathematical ideas as a steadily unfolding search for Platonic truths that 
transcend the particular cultural contexts in which these ideas arose” (p. 10). But 
he raises at once, in the completion of the sentence just quoted, the obvious 
objection, that one can write history in this way “only by discounting the rich 
variety of meanings that accompanied” those ideas in their actual concrete settings. 
With this objection I agree entirely—but I would push the argument a bit further. 
There is a facet of the issue that most discussions mention scarcely if at all. 
I suppose that those who credit the ancients with this or that modern idea think 
that the ascription amounts to generous praise. For all their limitations, the 
argument goes, these forerunners had the Right Stuff, they were spiritually so 
close to us that they seem (in Littlewood’s memorable phrase) like “fellows of 
another college”; and wouldn’t they rejoice in our good opinion of them? Well, 
maybe so. It might just be, however, that the mathematicians of bygone ages would 
also, or even rather, wish to be studied and judged for what they were, in their own 
proud individuality and distinctiveness. In the writing of history within the tradi- 
tional academic discipline of that name, the goal of an empathetic portrayal of the 
past on its own terms is now a commonplace. This is not at all to say, of course, 
that anticipations of, influences on, the present are not worth notice. But the aim 
is the polar opposite of the deliberate backward projection of modern modes of 
thought and feeling. Rather, the quest is for what William Blake would call the 
“minute particulars,” the specific and defining uniqueness, of each time and place 
and people of the past. 

Historiography so conceived has at its very heart a moral imperative, a scrupu- 
lous respect for, and honoring of, those who went before. At its highest, it is—dare 
one say?—like an act of love, which would know and possess yet ultimately leave 
singular and autonomous. That makes its practice at once a noble and a formidably 
demanding enterprise. George Steiner once wrote, in a different but parallel 
context, that “there can be no other thanks [to the creators of our cultural 
heritage] than extreme precision, than the patient, provisional, always inadequate 
attempt to get each case right.” That challenge is present no less in the historiogra- 
phy of mathematics than elsewhere, despite the (for some) supposedly eternal 
character of the truths unveiled. Indeed, in the last analysis this timelessness is a 
red herring, for the most passionate Platonist must concede that mathematical 
discovery is the work of individual minds in historically conditioned settings. 

So who should write the history of that discovery? Where the object of study is 
the technical progress made in the 20th century, André Weil’s claim that only 
mathematicians need apply seems incontestable. In most fields of current mathe- 
matics, the dynamic of the development is so overwhelmingly “internal,” and the 
barriers of specialist knowledge so forbidding, that the prospective historian 
cannot hope to succeed without something very close to the active researcher’s 
intimate grasp of, and “feel” for, the subject. The scarcity of people qualified in 
this sense must be the prime reason for the deplorable under-representation of the 
20th century in the journals and conference programs devoted to serious history of 
mathematics. 

But if the historiography of 20th-century mathematics must be left to the 
practitioners, the past—even the relatively recent past—is a different case. 
Roger Cooke says acutely in this volume (p. 177) that, thanks above all to the 
massive modern trend toward abstraction and generalization, 


To reconstruct precisely the present state of any nineteenth-century mathe- 
matical topic is in a sense impossible. No mathematical problem is under- 
stood exactly as it was understood ... a century ago. 


Then how much wider still the gulf between us and the still more distant past! The 
farther back in time, the more foreign and elusive must be the “mindset” that the 
historian would seek to know. Moreover, and crucially, increasing remoteness from 
the present increases also the role of “external” factors in mathematical 
activity—and so diminishes, in proportion, the place of purely mathematical skills 
and knowledge in the aspiring historian’s stock in trade. Where ancient mathematics 
is in question, the historian needs little if any technical grasp beyond the very 
elementary levels (quadratic equations or whatever) achieved by the civilization 
under her gaze. The resources that must sustain her are in other directions 
entirely. The importance of the relevant linguistic competence is obvious. The 
whole of the surrounding cultural and social matrix is germane and must be 
mastered. Perhaps the insights of modern anthropology can be brought to bear, as 
by Geoffrey Lloyd in his superb studies of ancient Greek science. Above all, the 
successful explorer of mathematics’ distant past must bring the precious gifts of 
imagination and of empathy that alone give any hope of access to alien minds. In 
this terribly difficult undertaking the research mathematician as such has abso- 
lutely no privileged status, no claim whatever to special authority. 

The good things in the book under review, and there are many, make their own 
valuable contributions to the history of mathematics and to the creative use of that 
history in our classrooms. Thus they may serve also toward fulfilling the hope, 
articulated here by Fred Rickey, of educating “a general population with a much 
better feel for what mathematicians do and why it is important.” In that urgent task 
Vita Mathematica will not set the world on fire, but it should light some candles; 
and every bit helps. 

General, T*(13-14: 2). Principles and Practice 
of Mathematics. COMAP. Springer-Verlag, 
1997, xi + 686 pp, $64.95. [ISBN 0-387-94612-8] 
A thoughtful alternative to calculus as an 
entry point to college mathematics; same level 
and prerequisites as first-year calculus. Pre- 
pares students for more advanced studies in 
mathematics and complements the calculus se- 
quence. Topics: sequences and difference equa- 
tions, vectors, some analytic geometry, linear 
programming, basic linear algebra, combina- 
torics, graphs and algorithms, logic, probability 
and decision theory, symmetry and permutation 
groups, coding. Extensive applications. Gives 
students both specific tools and a sense of the 
breadth of mathematics used in scientific and 
industrial settings. LB 


Recreational Mathematics, S**(13-16), P, 
L**. From Erdős to Kiev: Problems of 
Olympiad Caliber. Ross Honsberger. Dol- 
ciani Math. Expos., No. 17. MAA, 1996, xii + 
257 pp, $31 (P). [ISBN 0-88385-324-8] Only 
a master expositor could take these problems, 
mostly from the 1987 and 1988 volumes of Crux 
Mathematicorum, and present them as works of 
art with direct appeal to general readers. The 
author is quick to draw attention to subtleties 
that make the problem interesting; his solutions 
are exquisitely clear and easy to read, instruc- 
tive and pleasurable. LCL 


Recreational Mathematics, S**, L*. Five 
Hundred Mathematical Challenges. Edward 
J. Barbeau, Murray S. Klamkin, William O.J. 
Moser. Spectrum Ser. MAA, 1995, xi + 227 pp, 
$29.50 (P). [ISBN 0-88385-519-4] These 
problems, first published in a series of booklets 
almost twenty years ago, span all areas of high 
school mathematics (pre-calculus), and range in 
difficulty and sophistication from easy puzzles 
to problems at the Olympiad level. A useful ap- 
pendix summarizes essential knowledge (com- 
binatorics, arithmetic, algebra, inequalities, ge- 
ometry and trigonometry, analysis). A short 
index classifies problems by subject. LCL 


Recreational Mathematics, S*(16-18), P, 
L**. Contests in Higher Mathematics: Miklós 
Schweitzer Competitions 1962-1991. Ed: 
Gábor J. Székely. Problem Books in Math. 
Springer-Verlag, 1996, vii + 569 pp, $59. 
[ISBN 0-387-94588-1] Questions, with de- 
tailed solutions, from this unique and presti- 
gious Hungarian mathematics take-home exam. 
(A big step beyond the Putnam; students may 
use materials available in libraries or homes, 
and have ten days to prepare their solutions). 
Topics include algebra, combinatorics, theory 
of functions, geometry, measure theory, number 
theory, operators, probability theory, sequences 
and series, topology, and set theory. LCL 


Recreational Mathematics, S**, L**. Lenin- 
grad Mathematical Olympiads, 1987-1991. 
Dmitry Fomin, Alexey Kirichenko. Contests 
in Math., V. 1. MathPro Pr, 1994, xix + 197 pp, 
$24 (P). [ISBN 0-9626401-4-X] An out- 
standing collection of inviting challenge prob- 
lems graded by level of difficulty from middle 
school to high school. Many are suitable for 
undergraduate “Problems-of-the-Week” recreations 
(arithmetic, algebra, combinatorics, discrete 
mathematics, geometry, functions, beginning 
analysis). LCL 

Recreational Mathematics, S**, L**. The 
Universe in a Handkerchief. Martin Gard- 
ner. Copernicus (Springer-Verlag), 1996, x 
+ 158 pp, $19. [ISBN 0-387-94673-X] A 
friendly, informative introduction and guide to 
the ingenious and intriguing games, puzzles, 
and word plays of Lewis Carroll. LCL 


Recreational Mathematics, S, L. Rediscov- 
ered Lewis Carroll Puzzles. Ed: Edward Wakel- 
ing. Dover, 1995, xiii+ 79 pp, $4.95 (P). [ISBN 
0-486-28861-7] Forty-two recreational puz- 
zles and games (with solutions) used by Lewis 
Carroll to entertain friends, colleagues, and 
“children of ages five to ninety-five.” LCL 


Recreational Mathematics, S, L. ARML- 
NYSML Contests, 1989-1994. Lawrence Zim- 
merman, Gilbert Kessler. Contests in Math., 
V. 2. MathPro Pr, 1995, xvii + 189 pp, 
$19.95 (P). [ISBN 0-9626401-6-6] A sequel 
to the ARML-NYSML Contest Book 1973-1985 
published by NCTM in 1987. These contest 
problems for high school students range from 
short answer questions to challenging multi-part 
problems requiring in-depth analysis and origi- 
nal thinking. All problems are original; instruc- 
tive solutions. LCL 


Recreational Mathematics, S. Collection of 
Problems on Smarandache Notions. Charles 
Ashbacher. Erhus Univ Pr, 1996, 73 pp, 
$8.25 (P). [ISBN 1-879585-50-2] Collection 
of open problems in number theory, mostly of 
the following sort: Which elements of a given 
sequence S have property P? For example: 
Which triangular numbers, with digits $x_i$, can be 
“partitioned,” for some $1 < k < m < n$, into the 
form $x_1 x_2 \ldots x_k \; x_{k+1} \ldots x_m \; x_{m+1} \ldots x_n$ so that 
$x_1 x_2 \ldots x_k + x_{k+1} \ldots x_m = x_{m+1} \ldots x_n$? LCL 


History, L. The Way I Remember It. Walter 
Rudin. History of Math., V. 12. AMS, 1997, ix 
+ 191 pp, $29. [ISBN 0-8218-0633-5] Walter 
Rudin’s memoirs. Includes samples of his 
work. Written for non-analysts. LC 


History, L*. Poincaré and the Three Body 
Problem. June Barrow-Green. History of 
Math., V. 11. AMS, 1997, xvi + 272 pp, $49. 
[ISBN 0-8218-0367-0] Account of Poincaré’s 
memoir on the three-body problem from a math- 
ematical and historical perspective. Also dis- 
cusses earlier work by other mathematicians, 
reactions by Poincaré’s contemporaries, and the 
memoir’s influence on later work. LC 


Foundations, T(16—17: 1), L. Intermediate Set 
Theory. F.R. Drake, D. Singh. Wiley, 1996, x 
+ 234 pp, $29.95 (P). [ISBN 0-471-96496-4] 
Good intermediate level treatment of set theory 
including ZFC, first-order logic, cardinals and 
ordinals, axiom of choice, constructible sets and 
forcing, the standard paradoxes, and the devel- 
opment of mathematics within ZFC. RM 


Combinatorics, P. Matroid Theory. Eds: 
Joseph E. Bonin, James G. Oxley, Brigitte Ser- 
vatius. Contemp. Math., V. 197. AMS, 1996, 
xii + 418 pp, $72 (P). [ISBN 0-8218-0508-8] 
Proceedings of a 1995 AMS-IMS-SIAM Joint 
Summer Research Conference at the University 
of Washington. 


Discrete Mathematics, S(14-15). Exploring 
Discrete Mathematics With Maple. Kenneth H. 
Rosen, et al. McGraw-Hill, 1997, viii + 392 pp, 
$20.97 (P). [ISBN 0-07-054128-0] Guide to 
Maple’s discrete mathematics capabilities. As- 
sumes no prior experience with Maple. Includes 
discussion and examples of built-in commands 
as well as procedures written for the book. LC 


Number Theory, T(15—17: 2), S, P, L. Number 
Theory: An Introduction. Don Redmond. Pure 
& Appl. Math., V. 201. Marcel Dekker, 1996, 
xii + 749 pp, $175. [ISBN 0-8247-9696-9] 
Lots of exercises and problems, with optional 
computer investigations, accompany this ex- 
pansive, instructive, and informative text. The 
first half contains material for an undergradu- 
ate introduction (primes, divisibility, congru- 
ences, quadratic residues, Diophantine equa- 
tions); the second half demonstrates, in a self- 
contained way, how other areas of mathematics 
enter into the study of natural numbers (contin- 
ued fractions, fractions), arithmetic functions, 
Bertrand’s Postulate, the Chebychev Theorem, 
the prime number theorem, and algebraic num- 
ber theory. LCL 


Number Theory, P. Sets of Multiples. Richard 
R. Hall. Tracts in Math., V. 118. Cambridge 
Univ Pr, 1996, xvi + 264 pp, $59.95. [ISBN 
0-521-40424-X] Asymptotic analysis, mostly 
via sieve methods, of number theory problems 
that can be phrased in terms of sets of integers 
for which any integer multiple of an element is 
also in the set. DB 


Number Theory, S*(15-18), P*, L**. Theory 
of Algebraic Integers. Richard Dedekind. 
Transl: John Stillwell. Math. Lib. Cambridge 
Univ Pr, 1996, vii + 158 pp, $22.95 (P). [ISBN 
0-521-56518-9] A translation of Dedekind’s 
1877 memoir explaining his ideal theory to a 
general mathematical audience. Extensive in- 
troduction by John Stillwell describes the his- 
tory and problems of the study of algebraic inte- 
gers. Accessible to good undergraduates. DB 


Linear Algebra, S(14-16), L. Linear Algebra: 
Challenging Problems for Students. Fuzhen 
Zhang. Stud. in Math. Sci. Johns Hopkins 
Univ Pr, 1996, ix + 174 pp, $14.95 (P); $35. 
[ISBN 0-8018-5459-8; 0-8018-5458-X] 200 
problems with hints and solutions. Problems 
range from easy to difficult; some are standard 
problems found in any text. LC 


Linear Algebra, T(15-18: 1). Linear Algebra 
with Applications. John T. Scheick. Intern. Ser. 
in Pure & Appl. Math. McGraw-Hill, 1997, xv 
+ 432 pp, $57.75. [ISBN 0-07-055184-7] An 
extraordinary compilation of topics and applica- 
tions from linear algebra both inside and outside 
mathematics. Well-written; highly mathemati- 
cal. A worthy addition to your library. PF 


Algebra, T(15-16: 2), L. Fundamentals of Ab- 
stract Algebra. D.S. Malik, John M. Mordeson, 
M.K. Sen. McGraw-Hill, 1997, xix + 636 pp, 
$64.63. [ISBN 0-07-040035-0] Standard top- 
ics (groups through Sylow theorems, solvabil- 
ity, nilpotence; rings and modules, including 
UFD’s, Noetherian, Artinian, matrices; fields, 
Galois theory), plus sections on coding theory 
and Gröbner bases. Proofs very calculational; 
worked-out exercises after each section. RM 


Algebra. Groups as Galois Groups: An Introduction.
Helmut Völklein. Stud. in Adv. Math., 
V. 53. Cambridge Univ Pr, 1996, xvii + 248 pp, 
$49.95. [ISBN 0-521-56280-5] The first half 
develops the background (covering space the- 
ory, Riemann surfaces, number theory) to study 
the inverse Galois problem. The second half 
presents recent results (braid group actions, em- 
bedding problems, moduli spaces). TH 


Real Analysis, T(17: 2). Real Analysis. An- 
drew M. Bruckner, Judith B. Bruckner, Brian S. 
Thomson. Prentice-Hall, 1997, xiv + 713 pp. 
[ISBN 0-13-458886-X] Includes material on 
measure theory, Banach spaces, Hilbert space, 
and pointwise convergence of Fourier series. 
Develops some of the theory in the (numerous) 
exercises. PG 


Numerical Analysis, P. The Mathematics 
of Numerical Analysis. Eds: James Renegar, 
Michael Shub, Steve Smale. Lect. in Appl. 
Math., V.32. AMS, 1996, xi+927 pp, $125 (P). 
[ISBN 0-8218-0530-4] Proceedings of a 1995 
AMS-SIAM Summer Seminar in Park City, 
Utah. 


Algebraic Geometry, T(17: 2), P. Combinatorial
Convexity and Algebraic Geometry. Günter 
Ewald. Grad. Texts in Math., V. 168. Springer- 
Verlag, 1996, xiv + 372 pp, $59. [ISBN 
0-387-94755-8] Shows the relationship be- 
tween combinatorial and algebraic geometry via 
torus embeddings. Topics: polytopes, polyhedral
sets, toric varieties, sheaves, cohomology. 
Assumes only linear algebra and calculus. JD 


Topology, T(18: 2), P. Knots, Links, Braids 
and 3-Manifolds: An Introduction to the New 
Invariants in Low-Dimensional Topology. V.V. 
Prasolov, A.B. Sossinsky. Transl. of Math. 
Mono., V. 154. AMS, 1997, viii + 239 pp, 
$99. [ISBN 0-8218-0588-6] Presents Jones 
and Vassiliev invariants in an elementary man- 
ner, but bulk of book is rigorous geometric 
treatment of Jones-Witten invariants. Includes 
Heegaard decompositions, surgery, Kirby cal- 
culus, branched coverings, but does not cover 
any physics. Exercises with solutions. JD 


Operations Research, T(17-18), P. Stochastic 
Programming Problems with Probability and 
Quantile Functions. Andrey I. Kibzun, Yuri S. 
Kan. Intersci. Ser. in Systems & Optim. Wiley, 
1996, xiii + 301 pp, $64.95. [ISBN 0-471- 
95815-8] Stochastic programming problems 
arise when optimizing a probability function 
that is the expectation of the indicator func- 
tion on a set. Traditional optimization tech- 
niques fail because of the underlying discon- 
tinuity. Uses examples to establish the central 
problems, then presents a theoretical framework 
that allows the calculation of probability and 
quantile functions. MPR 

Optimization, P. Quasidifferentiability and 
Nonsmooth Modelling in Mechanics, Engineering
and Economics. Vladimir F. Dem’yanov, 
et al. Nonconvex Optim. & Its Applic., V. 10. 
Kluwer Academic, 1996, xvii + 348 pp, $169. 
[ISBN 0-7923-4093-0] 


Probability, T(17: 1), P, L. Probability: A Sur- 
vey of the Mathematical Theory, Second Edi- 
tion. John W. Lamperti. Ser. in Prob. & Stat. 
Wiley, 1996, x + 189 pp, $39.95. [ISBN 0- 
471-15407-5] Designed for a second course 
in probability, assuming some background in 
measure theory (outlined in an appendix). Four 
chapters: Foundations; Laws of Large Numbers 
and Random Series; Limiting Distributions and 
the Central Limit Problem; the Brownian Mo- 
tion Process. (First Edition, TR, October 1967; 
Extended Review, February 1969.) RSK 


Stochastic Processes, T(16-17: 1), L. The 
Analysis of Time Series: An Introduction, 
Fourth Edition. C. Chatfield. Stat. Textbook 
Ser. Chapman & Hall, 1995, xii + 241 pp, 
$34.95 (P). [ISBN 0-412-31820-2] Covers 
stationary, nonstationary, and bivariate pro- 
cesses; modeling in the time domain; forecast- 
ing; spectral analysis; linear systems. Appendix 
on Laplace, Fourier and Z-transforms; does not 
assume prior knowledge of transforms. New 
chapter on state-space models and the Kalman 
filter. Other chapters revised to clarify and update.
(Third Edition, TR, January 1986.) LB 


Stochastic Processes, T(18: 2), P. Time Series 
Analysis: Nonstationary and Noninvertible Dis- 
tribution Theory. Katsuto Tanaka. Ser. in Prob. 
& Stat. Wiley, 1996, x + 623 pp, $89.95. [ISBN 
0-471-14191-7] Methods for noninvertible or 
nonstationary linear time series. Assumes back- 
ground in mathematical statistics, including sta- 
tionary stochastic processes. LB 


Mathematical Statistics, T(16-17: 2), S, P*, 
L. Operational Subjective Statistical Methods: 
A Mathematical, Philosophical, and Historical 
Introduction. Frank Lad. Ser. in Prob. & Stat. 
Wiley, 1996, xix + 484 pp, $69.95. [ISBN 
0-471-14329-4] Somewhat mistitled. Mainly 
a formal development of material needed for 
mathematical statistics, primarily probabilistic, 
“as it is understood in the subjectivist perspective.”
This viewpoint, based in large part on 
the work of Bruno de Finetti, discounts much 
of current statistical practice on philosophical 
grounds. It is “set in the context of an oper- 
ational and positivist approach to science, an 
analytic approach to philosophy, and a con- 
structivist, finitist, and intuitionist approach to 
mathematics.” Contains many historical and 
philosophical notes in making the case for this 
admittedly controversial minority view. RSK 


Mathematical Computing, S(13-16). Maple 
V Primer, Release 4. Frank Garvan. CRC Pr, 
1997, 143 pp, $14.95 (P). [ISBN 0-8493-2681-8]
Pocket guide to Maple V Release 4. Lots 
of examples. MPR 


Mathematical Computing, S*(14-15), C. 
HP-48G/GX Investigations in Mathematics. 
Donald R. LaTorre, Donald L. Kreider, T.G. 
Proctor. Charles River Media (POB 417, 403 
VFW Dr., Rockland, MA 02370), 1996, xiii + 
636 pp, $29.95 (P), with disk. [ISBN 1-886801-23-1]
A detailed, rich, clearly written collection
of guided investigations, using the HP-48, 
into areas of calculus, differential equations, en- 
gineering mathematics, and linear algebra. In- 
cludes many examples and (solved) exercises, 
sample programs, and a basic introduction to 
the machine. Disk contains a large collection 
of special-purpose programs transferable (with 
appropriate interface kit) to the HP-48. PZ 
Computer Science, P. Exploring Java. Patrick 
Niemeyer, Joshua Peck. O’Reilly & Associates, 
1996, xv + 407 pp, $24.95 (P). [ISBN 1-56592- 
184-4] 

Computer Science, P. Algebraic 3-D Model- 
ing. Andreas Hartwig. AK Peters, 1996, x + 
222 pp, $59. [ISBN 1-56881-023-7] Mathematical
treatment of geometric modeling, in 
particular boundary representations, in computer
science. Aims for a generalized algebra 
based on Boolean set operations, hull construc- 
tions, and combinatorial techniques applicable 
to a wide range of modeling environments. 
Uses a small language modeler called GECO 
to compare different approaches. MPR 


Applications (Fluid Mechanics), P. Mathe- 
matical Problems in the Theory of Water Waves. 
Eds: F. Dias, J.-M. Ghidaglia, J.-C. Saut. Con- 
temp. Math., V. 200. AMS, 1996, xxiii + 
235 pp, $55 (P). [ISBN 0-8218-0510-X] Pro- 
ceedings of a 1995 workshop at CIRM in Lu- 
miny, France. 


Applications (Physics), P. Topics in Statistical 
and Theoretical Physics: F.A. Berezin Memorial 
Volume. Eds: R.L. Dobrushin, et al. AMS 
Transl. Ser. 2, V. 177: Adv. in Math. Sci., V. 32. 
AMS, 1996, ix + 223 pp, $99. [ISBN 0-8218- 
0425-1] 12 papers by students and colleagues 
of Berezin. 


Applications (Quantum Theory), T(18: 2), 
S, P. Lie Groups, Lie Algebras, Cohomol- 
ogy and Some Applications in Physics. José 
A. de Azcárraga, José M. Izquierdo. Mono. 
on Math. Physics. Cambridge Univ Pr, 1995, 
xvii + 455 pp, $100. [ISBN 0-521-46501-X]
Largely self-contained, but assumes some 
knowledge of differential geometry, Cartan cal- 
culus, and quantum field theory. The authors are 
superb mathematicians and good writers. MU 
Applications (Relativity), T(18: 2), S, P. 
Global Lorentzian Geometry, Second Edition. 
John K. Beem, Paul E. Ehrlich, Kevin L. Easley. 
Mono. & Textbooks in Pure & Appl. Math., 
V. 202. Marcel Dekker, 1996, xiv + 635 pp, 
$175. [ISBN 0-8247-9324-2] Contains addi- 
tional material on stability, gravitational plane 
wave space-times, and the splitting problem. 
(First Edition, TR, May 1982.) MU 


Applications, P. Nonlinear Mathematics and 
Its Applications. Ed: Philip J. Aston. Cam- 
bridge Univ Pr, 1996, vii + 256 pp, $24.95 (P). 
[ISBN 0-521-57676-8] 9 papers from a 1995 
Spring School for postgraduate students at the 
University of Surrey. Applications in engineering,
fluid dynamics, material science, and biology.
LB: Lynne Baur, Carleton; DB: David Bressoud, 
Macalester; LC: Laura Chihara, St. Olaf; JD: Jill Dietz, 
St. Olaf; PF: Paul Froeschl, Macalester; PG: Philip Gloor, 
St. Olaf; TH: Tom Halverson, Macalester; RSK: Richard 
S. Kleber, St. Olaf; LCL: Loren C. Larson, St. Olaf; RM: 
Richard Molnar, Macalester; MPR: Matthew P. Richey, 
St. Olaf; MU: Milton Ulmer, Carleton; PZ: Paul Zorn, 
St. Olaf. 
University (Kingston, Ont.) and McGill University (Montreal). His speciality at York was an undergrad- 

EDITOR’S ENDNOTES 


Stimulated by Margaret Kleinfeld’s article “Calculus: Reformed or Deformed?”, 
MONTHLY 103 (1996) 230-232, James Sandefur wrote: 


Professor Kleinfeld, among other things, argued for teaching fewer applications in mathematics 
courses. She offered the analogy of a guide in a foreign country showing a cathedral. The group 
became bored and asked to see the shopping center, analogous to our students getting bored and 
asking for applications. I would like to take her analogy one step further. Having recently been 
to Europe and toured many cathedrals, I have had tour guides who only describe the cathedral, 
“This archway is 20 meters high, the stained glass window is... .” The tours I have enjoyed the 
most integrated the actual building with the history (when built, wars in which it was damaged, 
parts added, etc.), the art (the paintings on the ceiling were by so-and-so, with the style coming 
from...), and the purpose (this cathedral was built to house the... ). Just as the cathedral is 
more than a collection of stone and stained glass, mathematics is more than a collection of 
formulas and abstract ideas. We should give our students an appreciation for the genius it took 
to come up with a particular new idea that may seem simple now, just as the perspective of a 
painting in a cathedral took genius at the time it was painted, although today that use of 
perspective is common. Our students should understand the historical context of certain 
mathematics, such as the applications that led to a new advancement, just as the guide should 
describe the historical development of the cathedral. But most importantly, what students will 
remember best is the mathematics they have had time to explore, just as I remember best the 
parts of the cathedral I explored on my own. So when the group asks to see the shopping center, 
it’s not that they don’t like the cathedral; it is that they don’t like the guide. 


The complete publication information for a recently-reviewed book (MONTHLY 
103 (1996) 705) is: In Search of Infinity, by Naum Ya. Vilenkin. Translated by Abe 
Shenitzer with the editorial assistance of Hardy Grant and Stefan Mykytiuk. 
Birkhäuser, 1995, 145 pp., $24.50. 

The Challenge to identify the cover illustration on the August 1995 issue 
(MONTHLY 102 (1995) 660) resulted in 49 replies to David Fowler. None of the 
respondents had seen the plot before; the following identified it correctly: Juan 
Arias-de-Reyna, Donald Bridges, Henry Edwards, David V. Feldman, Joran 
Friberg, Dean P. Foster, Niels-Henrik Holstein-Rathlou, Peter Jones, Azzedine 
Kaced, Steve Kass, K. Robin McLean, John Mason, Joe Moser, Les Reid, Joel E. 
Rosenberg, Jeremy T. Tyson, and Hansklaus Rummler. Professor Fowler provided 
full details in his article “The Binomial Coefficient Function,” MONTHLY 103 
(1996) 1-17. 

Victor Klee offered the following comments on James Angelos et al., “Packability
of Five Spheres on a Sphere Implies Packability of Six,” MONTHLY 103 (1996) 
894-896: 


... their result... goes back at least to K. Schütte and B. van der Waerden, Math. Ann. 123 
(1951) 96-124. The relationship (of six points on the sphere to five points) that they establish has 
also been established for...12 points to 11 points, and there are conjectures in the work of 
Raphael Robinson concerning the same phenomenon for...the cases in which the number of 
points is 24, 48, 60, or 120. However, the conjecture for 24 has been disproved by Tarnai 
in Budapest. 


Finally, the charming photo of the Chez Math sign in Maastricht in the 
December ’96 issue (MONTHLY 103 (1996) 845) has a scrambled attribution. It was 
contributed by Evan Romer at Susquehanna Valley High School, Conklin, NY. 


Roger A. Horn, Editor 
Princeton Mathematics 


Now Available!
Three-Dimensional Geometry and Topology, Volume 1
William P. Thurston
Edited by Silvio Levy
Princeton Mathematical Series, 35: Luis A. Caffarelli, John N. Mather,
and Elias M. Stein, Editors
Cloth $39.50 ISBN 0-691-08304-5


The Emergence of Complexity in Mathematics, Physics, Chemistry, and Biology
Edited by Bernard Pullman
Proceedings of the Pontifical Academy of Sciences
Paper $39.50 ISBN 0-691-01238-5


New paperback edition
Convex Analysis
R. Tyrrell Rockafellar
“This book should remain for some years as the standard reference
for anyone interested in convex analysis.”
—J. D. Pryce, Edinburgh Mathematical Society
Princeton Landmarks in Mathematics and Physics
Paper $22.95 ISBN 0-691-01586-4


Due Summer 

Princeton University Press 


AVAILABLE AT FINE BOOKSTORES OR DIRECTLY FROM THE PUBLISHER: 800-777-4726 
VISIT OUR WEBSITE: PUP.PRINCETON.EDU 


Join us for a World Class 
Meeting in America’s Olympic City 


MAA Summer 
MATHFEST 


Atlanta, Georgia 


For details, look up "Meetings" on 
MAA On-line: http://www.maa.org 


The Lighter Side 
of Mathematics 


Proceedings of the Eugène Strens Memorial Conference 
on Recreational Mathematics and its History 


Richard K. Guy and 
Robert E. Woodrow, Editors 


The level of exposition is high, and the fun infectious. 
The reader can find routes to serious mathematics, 
such as hyperbolic geometry, fractals, group theory, 
and number theory, all beginning with a delightful 
puzzle. A sparkling addition for any library where the 
lover of mathematics at any level comes for support. 
—Choice 


The book is a fantastic feast of far-from-trivial topics. 
Entertaining mathematics not only can lead to unexpect- 
ed applications...but it is one of the best ways to stimu- 
late interest in mathematics among both students and 
the general public. 

—Martin Gardner, American Scientist 


In August of 1986 a special conference on recreational 
mathematics was held at the University of Calgary to 
celebrate the founding of the Strens Collection. Leading 
practitioners of recreational mathematics from around 
the world gathered in Calgary to share with each other 
the joy and spirit of play that is to be found in recreation- 
al mathematics. 


The papers in this volume represent a treasure trove of 
recreational mathematics by a star-studded cast: Leon 
Bankoff, Elwyn Berlekamp, H.S.M. Coxeter, Ken Falconer, 
Branko Grünbaum, Richard Guy, Doris Schattschneider, 
David Singmaster, Athelstan Spilhaus, Stan Wagon and 
many others. 


If you are interested in tessellations, Escher, tiling, 
Rubik’s cube, pentominoes, games, puzzles, the arbelos, 
Henry Dudeney, or change ringing, then this book is a 
must for you. 


376 pp., Paperbound, 1994 
A Radical Approach 
to Real Analysis 


David Bressoud 


What is radical about this book as real analysis books 
go, is its stronger historical approach...The past decade 
or so has witnessed the appearance of a substantial 
number of “bridge the gap” introductions to real analy- 
sis which lead the students at a gentler pace through the 
fundamentals of real analysis according to the tradition- 
al syllabus. It is well worth considering whether stu- 
dents in their first undergraduate real analysis course 
might be better served by a radical approach such as 
Bressoud's. 


—Mathematical Reviews 


The book can be recommended as a resource for 
instructors, and as collateral reading for students who 
may wonder how and why the early pioneers devel- 
oped concepts such as continuity, differentiability, 
integrability, and uniform convergence. 
—Zentralblatt für Mathematik 


The book ... will appeal as a text; it should be in 
every library as a reference. 
—Choice 


This book is an undergraduate introduction to real analy- 
sis. Teachers can use it as a textbook for an innovative 
course, or as a resource for a traditional course. Students 
who have been through a traditional course, but do not 
understand what real analysis is about and why it was 
created, will find answers to many of their questions in 
this book. 


The book begins with Fourier’s introduction of trigono- 
metric series and the problems they created for the 
mathematicians of the early nineteenth century. 


Cauchy’s attempts to establish a firm foundation for 
calculus follow, and the author considers his failures 
and his successes. The book culminates with 
Dirichlet’s proof of the validity of the Fourier series 
expansion and explores some of the counterintuitive 
results Riemann and Weierstrass were led to as a result 
of Dirichlet’s proof. 


Mathematica commands and programs are included in 
the exercises. However, you may use any mathemati- 
cal tool that has graphing capabilities including the 
graphing calculator. 


336 pp., Paperbound, 1994 ISBN 0-88385-701-4 
List: $32.95 MAA Member: $25.50 
CRYPTOLOGY 


Albrecht Beutelspacher 


This fascinating little book is eminently readable, and it 
is a great deal of fun to peruse... the book is a real treat. 
We need more books like this, crafted by expert hands yet 
crafted so that the general reader can enjoy them. 
—Bulletin of The Institute of Combinatorics and 
Its Applications 


This excellent and entertaining book is suitable for a first 
course in cryptology for mathematical enthusiasts. An 
abundance of exercises and an excellent list of related ref- 
erences are included. 
—The Mathematics Teacher 


In spite of the light-hearted style in which the book is 
written throughout, it is a serious—and successful— 
attempt to explain the basis of coding and decoding mes- 
sages...I can strongly recommend this book to anyone who 
wants a brief but comprehensive, eminently readable, and 
up-to-date introduction to this increasingly popular topic. 
— The Mathematical Gazette 


All of cryptology is covered in this work...Occupying a 
niche in the halls of the ivory tower of pure mathematics 
for nearly two millennia, number theory now forms a pil- 
lar of modern society. This book is the best explanation 
available today of how that pillar was constructed. 

— Charles Aschbacher 


A model to follow in order to make mathematics better 
known and understood. Accessible to a broad audience. 
Have fun reading this book, while you are getting a better 
understanding of cryptology. 

— Bulletin of the Belgian Mathematics Society 


How can messages be transmitted secretly? How 
can one guarantee that the message arrives safely 
in the right hands exactly as it was transmitted? 
Cryptology—the art and science of “secret writ- 
ing”—provides ideal methods to solve these prob- 
lems of data security. 


The book is fun to read, and the author presents 
the material clearly and simply. Many exercises 
and references accompany each chapter. 


176 pp., Paperbound, 1994 
EPISODES 


in Nineteenth and Twentieth 
Century Euclidean Geometry 


Ross Honsberger 


In this remarkable volume, the author fulfills the 
promise he makes in the preface: “that each topic has been 
extricated from the mass of material in which it is usually 
found and given as elementary and full a treatment as rea- 
sonably possible.” 


Professor Honsberger has succeeded in “finding” and “extricat- 
ing” unexpected and little known properties of such fundamen- 
tal figures as triangles, results that deserve to be better known. 
He has laid the foundations for his proofs with almost entirely 
synthetic methods easily accessible to students of Euclidean 
geometry. While in most of his other books Honsberger presents 
each of his “gems,” “morsels,” and “plums” as self-contained 
tidbits, in this volume he connects chapters with some deductive
threads. He includes exercises and gives their solutions at
the end of the book.


In addition to appealing to lovers of synthetic geometry, this
book will stimulate also those who, in this era of revitalizing
geometry, will want to try their hands at deriving the results
by analytic methods. Many of the incidence properties call to
mind the duality principle; other results tempt the reader to
prove them by vector methods, or by projective transformations,
or complex numbers.


Contents
1. Cleavers and Splitters
2. The Orthocenter
3. On Triangles
4. On Quadrilaterals
5. A Property of Triangles
7. The Symmedian Point
8. The Miquel Theorem
9. The Tucker Circles
10. The Brocard Points
11. The Orthopole
12. The Cevians
13. The Theorem of Menelaus
Suggested Reading
Solutions to the Exercises; Index


163 pp., Paperbound, 1995
ISBN 0-88385-639-5
List: $32.95
MAA Member: $25.50
A lab manual with software for introductory courses 
in group theory or abstract algebra 


Laboratory Experiences in Group Theory is a workbook 
of 15 laboratories designed to be used with the software 
Exploring Small Groups as a supplement to the regular 
textbook in an introductory course in group theory or 
abstract algebra. Written in a step-by-step manner, the 
laboratories encourage students to discover the basic 
concepts of group theory and to make conjectures from 
examples that are easily generated by the software. 
The labs can be assigned as homework or can be used 
in a structured laboratory setting. Since the software is 
user-friendly and the laboratories are complete, stu- 
dents and faculty should have no difficulty in using the 
labs without training. 


Most students find that the laboratories provide an 
enjoyable alternative to the “theorem-proof-example” 
format of a standard abstract algebra course. At the end 
of the semester, one student wrote in his evaluation of 
the course: 


I am truly grateful for the laboratory component...Work 
on the computer helped to make the abstract theory 
more concrete... One of the best things about the labs 
was that we formed our own conjectures about the patterns
we saw...I believe that the progression of (1) lab, 
(2) conjecture, (3) class discussion, and (4) proof was 
highly beneficial in gaining understanding of the 
abstract material of the course. 


Laboratory Experiences 
in Group Theory 


A Manual to be Used with 
Exploring Small Groups 


Ellen Maycock Parker 


Series: Classroom Resource Materials 


Table of Contents: 1. Groups and Geometry; 2. Cayley 
Tables; 3. Cyclic Groups and Cyclic Subgroups; 4. 
Subgroups and Subgroup Lattices; 5. The Center and 
Commutator Subgroups; 6. Quotient Groups; 7. Direct 
Products; 8. The Unitary Groups; 9. Composition 
Series; 10. Introduction to Endomorphisms; 11. The 
Inner Automorphisms of a Group; 12. The Kernel of an 
Endomorphism; 13. The Class Equation; 14. Conjugate 
Subgroups; 15. The Sylow Theorems; Appendix A. 
Table Generation Menu of Exploring Small Groups 
(ESG); Appendix B. Sample Library of ESG; Appendix 
C. Group Library of ESG; Appendix D. Group 
Properties Menu 


Exploring Small Groups, the software packaged with 
this lab manual, is on a 3 1/2” DD PC compatible disk. 
This is a DOS program that can be run in Windows. 
The software was developed by Ladnor Geissinger, 
University of North Carolina at Chapel Hill. 
Lion Hunting 


and Other Mathematical Pursuits 


A Collection of Mathematics, Verse, and Stories 


by Ralph P. Boas, Jr. 


Gerald L. Alexanderson and 
Dale H. Mugler, Editors 


I highly recommend Lion Hunting and Other 
Mathematical Pursuits to high school mathematics 
clubs, mathematics teachers of all levels, and anyone 
interested in mathematics. Perhaps the most impor- 
tant features of this book is how it subtly makes the 
reader aware of the nature of mathematics. 


— The Mathematics Teacher 


As a young man at the Institute for Advanced Study in 
Princeton, Ralph Philip Boas, Jr., together with a group of 
other mathematicians, published a light-hearted article on 
the “mathematics of lion hunting” under a pseudonym 
(1938). This sparked a sequence of articles on the topic, 
several of which are drawn together in this book. 


Lion Hunting includes an assortment of articles that show 
the many facets of this remarkable mathematician, editor, 
writer, and teacher. Along with a variety of his lighter 
mathematical papers, the collection includes Boas’ verse 
and short stories, many of which are appearing for the first 
time. Anecdotes and recollections of his numerous experi- 
ences and of his work and meetings with many distin- 
guished mathematicians and scientists of his day are also 
included as well as photographs taken by Boas of Hardy, 
Littlewood, Besicovitch, Weil, and others. 


The mathematical articles in this collection cover a range 
of topics. They include articles on infinite series, the mean 
value theorem, indeterminate forms, complex variables, 
inverse functions, extremal problems for polynomials and 
more. A special section of this book is devoted to articles 
about the teaching of mathematics, with titles such as 
“Calculus as an experimental science” and “Can we make 
mathematics intelligible?” 


Boas’s wit and playful humor are reflected in the verses 
included in this collection. The verses reflect the phases of 
his career as author, editor, teacher, department chair, and 
lover of literature. A section of the book describes the feud 
that Boas supposedly had with Bourbaki. Also included are 
many amusing anecdotes about famous mathematicians. 
Linear Algebra 
Problem Book 


Paul R. Halmos 


Were it possible for the experience of apprentice- 
ship to a master of mathematics to be packaged 
between the covers of a book, this would be it. 
No teacher of linear algebra should neglect to 
consult it. Highly recommended for all libraries. 


— Choice Magazine 


This is a book for mathematicians at all levels. Paul 
Halmos tells us, “Even if I know some answers, I don’t 
think I understand a subject until I know the questions. 
The questions in mathematics are called problems— 
and although I learned some linear algebra a long time 
ago, until now I have made no serious effort to examine 
the problems that the solutions are based on. I wrote 
this book to organize those questions—problems—in 
my own mind.” 


This book is useful to anyone who needs linear alge- 
bra—and nowadays that means every user of mathe- 
matics. It can be used as the basis of either an official 
course or a program of private study. 


If used as a course, the book can stand by itself, or if so 
desired, it can be stirred in with a standard linear alge- 
bra course as the seasoning that provides the interest, 
the challenge, the motivation that is needed by experi- 
enced scholars as much as by beginning students. 
Read the biographical essays written by individ- 
uals who have gotten exciting good-paying jobs 
by preparing themselves with a solid back- 
ground in the mathematical sciences. It will 
provide you and your students with a wealth of 
information about the types of different career 
paths that can be chosen for those who are 
well-prepared in mathematics. 


These mathematicians are found: 

• in well-known companies such as IBM, AT&T, and American Airlines,
• in some surprising places like FedEX Corporation, L. L. Bean, Perdue Farms,
• in government agencies
• in the arts (sculpture, music, and television),
• in the professions (law and medicine), and
• in education (elementary, secondary, college and university)


Monday — Friday 8:30 am — 5:00 pm 


Phone in Your Order Now! 1-800-331-1622 


101 Careers in 
Mathematics 


Andrew Sterrett, Editor 


Series: Classroom Resource Materials 


A career guide 
for your students. 
If they want to know 
why they should 
study mathematics, 
this book will tell 
them. 


Many of these individuals have started their 
own companies. 


Your students will see how these individuals use 
their mathematical sciences training on a daily 
basis in their work, often relying on the general 
problem-solving skills they have acquired in 
their mathematics courses. Those who studied 
statistics and computer science as well as mathe- 
matics, tell how their training in these disciplines 
helped them advance in their careers. 


Articles in the Appendix reprinted from the 
MAA’s magazine for students, Math Horizons, 
provide valuable advice on looking for a job 
and the expectations of industry. 


Catalog Code: 101/JR 
260 pp., 1996, Paperbound, ISBN 0-88385-704-9 
The use of the history of mathematics in the teaching 
of mathematics at all levels is an idea whose time has 
come. To use history in the teaching of undergradu- 
ate mathematics, the instructor must be familiar with 
the history as well as the mathematics. Vita 
Mathematica will enable college teachers to learn the 
relevant history of various topics in the undergradu- 
ate curriculum and help them incorporate this history 
in their teaching. 


For example, should calculus be approached from a 
geometric or an algebraic point of view? The book 
shows us how two important eighteenth century 
mathematicians, Colin Maclaurin and Joseph-Louis 
Lagrange, understood the calculus from these differ- 
ent standpoints and how their legacy is still impor- 
tant in teaching calculus today. We also learn why 
Lagrange’s algebraic approach dominated teaching in 
Germany in the nineteenth century. Some of the reasons
for this are related to the appropriate foundations
of the calculus, and so the book traces the 
ancient history of one of the possible foundations, 
the concept of indivisibles. Even though we generally
do not use this concept formally today, many ideas 
for a heuristic approach to the calculus can be developed
out of this study. 


Vita Mathematica 


Historical Research and Integration with Teaching 


Ronald Calinger, Editor 


Vita Mathematica contains numerous other articles
dealing with calculus, with algebra, combinatorics,
graph theory, and geometry, as well as more general
articles on teaching courses for prospective teachers.
This volume, then, demonstrates that the history of
mathematics is no longer tangential to the mathematics
curriculum, but in fact deserves a central role.


Learn from
the Masters


Frank Swetz, John Fauvel, Otto Bekken, 
Bengt Johansson, Victor Katz, Editors 


Provides high school and college teachers with important 
historical ideas and insights which can be immediately 
applied in the classroom. 


This book is for college and high school teachers who 
want to know how they can use the history of mathe- 
matics as a pedagogical tool to help their students con- 
struct their own knowledge of mathematics. Often, a 
historical development of a particular topic is the best 
way to present a mathematical topic, but teachers may 
not have the time to do the research needed to present 
the material. This book provides its readers with histor- 
ical ideas and insights which can be immediately 
applied in the classroom. 


The book is divided into two sections: the first on the use 
of history in high school mathematics, and the second on 
its use in university mathematics. So, high school teachers
planning a discussion of logarithms will find here the
historical background of that idea along with suggestions 
for incorporating that history in the development of the 
idea in class. College teachers of abstract algebra will ben- 
efit by reading the three articles in the book dealing with 
aspects of that subject and considering their ideas for pre- 
senting groups, rings, and fields. 


The articles are diverse, covering fields such as 
trigonometry, mathematical modeling, calculus, linear 
algebra, vector analysis, and celestial mechanics. Also 
included are articles of a somewhat philosophical nature, 
which give general ideas on why history should be used 
in teaching and how it can be used in various special 
kinds of courses. Each article contains a bibliography to 
guide the reader to further reading on the subject. 


[Cover image: Learn from the Masters. The Mathematical Association of America]


This book grew out of a conference in Norway which 
brought together mathematicians and mathematics 
educators from a dozen countries who were interested 
in the use of the history of mathematics as a pedagogi- 
cal tool in the teaching of mathematics. Since the con- 
ference which provided the genesis of this book took 
place in Norway near the home where Niels Henrik 
Abel spent his final days, the book’s title comes from a 
note scribbled in one of Abel’s notebooks: “It appears to 
me that if one wants to make progress in mathematics 
one should study the masters.” The authors hope that 
readers will benefit from Abel’s advice and show their 
students how they too can Learn from the Masters.


Algebra and Tiling


Homomorphisms in the Service of Geometry


Sherman Stein and Sándor Szabó


Algebra and Tiling is perfect for bringing alive an 
abstract algebra course. Intuitive but difficult problems 
of geometry are translated into algebraic problems more 
amenable to solution. Full of nice surprises, the book is a 


pleasure to read. 
—Choice 
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at all levels—from beginning students to practic- 
ing analysts—with the basic concepts and stan- 
dard tools necessary to understand analytical 
methods and better apply them to research in a 
variety of areas. 


Analysis takes readers quickly from basic topics 
to applications (many of them quite deep), 
incorporating only those results and construc- 
tions that work successfully in mathematics and 
its applications. The authors take great care to 
include topics that any working analyst uses in 
everyday practice. 


The book covers measure and integration, theory
of $L^p$ spaces, distribution theory, Fourier
analysis, potential theory, Sobolev spaces, and 
much more. Analysis is a unique, practical book 
that everyone—from the graduate student, to 
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made possible during the last fifteen years due 
to the methodologies of stochastic analysis and


Approximate Isometries on Euclidean Spaces


Rajendra Bhatia and Peter Šemrl


1. INTRODUCTION. Let $E$ and $F$ be Banach spaces. An isometry from $E$ to $F$
is a map $f: E \to F$ such that

$$\|f(x) - f(y)\| = \|x - y\| \quad \text{for all } x, y \in E. \tag{1}$$


Every isometry is continuous and injective. Among the earliest theorems for
Banach spaces is the Mazur-Ulam Theorem [13]. This says that if $f$ is a surjective
isometry between real Banach spaces $E$ and $F$, and if $f(0) = 0$, then $f$ is linear.
The conclusion is not valid for complex Banach spaces (just consider the complex
conjugation on $\mathbb{C}$). The hypothesis of surjectivity is essential in general, but can be
dropped for a large class of Banach spaces that includes real Hilbert spaces. The
condition $f(0) = 0$ is necessary for $f$ to be linear. If $f$ is any isometry then $f - f(0)$
is also an isometry, so this condition is no serious restriction.

If distances are known imprecisely one may not be able to say whether $f$ is an
isometry. Then the concept of an approximate isometry is useful. Given $\varepsilon > 0$, a
map $f: E \to F$ is called an $\varepsilon$-isometry if

$$\bigl|\,\|f(x) - f(y)\| - \|x - y\|\,\bigr| \le \varepsilon \quad \text{for all } x, y \in E. \tag{2}$$

Note that if $f$ is an $\varepsilon$-isometry then so is $f - f(0)$. The following problem was
posed by Hyers and Ulam [9]. If $f$ is a surjective $\varepsilon$-isometry between real Banach
spaces $E$ and $F$ such that $f(0) = 0$, then does there exist a surjective linear
isometry $g: E \to F$ such that

$$\|f(x) - g(x)\| \le K\varepsilon \quad \text{for all } x \in E, \tag{3}$$

where the constant $K$ is independent of $f$, but can depend on the spaces $E$ and
$F$? Hyers and Ulam [9] showed that if $E = F$ is a real Hilbert space then the
answer is in the affirmative with K < 10. 

The Hyers-Ulam problem has been solved over the years. It is only recently that 
it was shown that the sharp value of K is 2 for all Banach spaces [14]. 

The aim of this note is to discuss some of these matters, to explain a part of the 
original Hyers-Ulam ideas, and to show how to extract some more results from 
them. One major concern throughout the article is how essential the
assumption of surjectivity of $f$ is for the conclusions.


2. ISOMETRIES. Let $E$ be any Banach space and let $x, y$ be any two points of
$E$. The algebraic midpoint of $x$ and $y$ is the vector $m(x, y) = (x + y)/2$. A metric
midpoint of $x$ and $y$ is any point $z$ of $E$ that satisfies

$$\|z - x\| = \|z - y\| = \tfrac{1}{2}\|x - y\|. \tag{4}$$


The algebraic midpoint is always a metric midpoint. It is easy to see that if $E$ is a
Hilbert space there are no other metric midpoints for any pair of vectors $x, y$. This
is not so in all Banach spaces. Here is an easy example:
Let $E$ be the space $\mathbb{R}^2$ with the $\ell_1$-norm; i.e., if $x = (x_1, x_2)$ then $\|x\| =
|x_1| + |x_2|$. Let $x = (1, 0)$ and $y = (0, 1)$. The algebraic midpoint of $x$ and $y$ is
$(\tfrac{1}{2}, \tfrac{1}{2})$. This is at distance 1 from $x$ and $y$. So are all points $z$ of the form $(t, t)$,
where $0 \le t \le 1$. All these points are metric midpoints of $x$ and $y$. A pictorial
representation of this phenomenon might be helpful; see Figure 1. The unit ball of
$E$ is a diamond centered at the origin. Shift this diamond’s center to $(1, 0)$ and then
to $(0, 1)$. The intersection of the boundaries of these two diamonds is precisely the
set of metric midpoints of $x$ and $y$.
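
The $\ell_1$ example can be checked mechanically. The following is a minimal sketch, not part of the original note: the helper names `l1_norm` and `is_metric_midpoint` are hypothetical, and exact rational arithmetic is used so that the equalities in (4) are tested without rounding.

```python
from fractions import Fraction

def l1_norm(v):
    """The l1 norm |v1| + |v2| of a vector in R^2."""
    return abs(v[0]) + abs(v[1])

def is_metric_midpoint(z, x, y, norm=l1_norm):
    """Check relation (4): ||z - x|| = ||z - y|| = ||x - y|| / 2."""
    dzx = norm((z[0] - x[0], z[1] - x[1]))
    dzy = norm((z[0] - y[0], z[1] - y[1]))
    dxy = norm((x[0] - y[0], x[1] - y[1]))
    return dzx == dzy == dxy / 2

x = (Fraction(1), Fraction(0))
y = (Fraction(0), Fraction(1))

# Sample points (t, t) with 0 <= t <= 1; each should be a metric midpoint.
diagonal = [(Fraction(i, 10), Fraction(i, 10)) for i in range(11)]
all_midpoints = all(is_metric_midpoint(z, x, y) for z in diagonal)

# A point on the same line but outside the segment should fail the test.
outside = is_metric_midpoint((Fraction(3, 2), Fraction(3, 2)), x, y)
```

Here `all_midpoints` confirms the claim on a finite sample only; it does not replace the short calculation $|t - 1| + |t| = 1$ for $0 \le t \le 1$.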


Figure 1 


Let $M_0(x, y)$ be the set of all metric midpoints of $x$ and $y$. It is easy to see that
$M_0(x, y)$ is a closed, convex, and bounded subset of $E$.

There is a class of Banach spaces in which the norm ensures
that for all pairs $x, y$ the set $M_0(x, y)$ is just the singleton $\{m(x, y)\}$. These are the
strictly convex Banach spaces. The space $E$ is called strictly convex if whenever
$\|x\| = \|y\| = 1$ and $\|(x + y)/2\| = 1$, then $x = y$ (that is, every point of the unit
sphere of $E$ is an extreme point of the unit ball). For $1 < p < \infty$, $\ell_p$ is strictly convex. A simple
calculation with norms shows that if $E$ is strictly convex then $M_0(x, y) = \{m(x, y)\}$
for all $x, y \in E$.

The importance of this observation is the following. The relation (4) that defines
metric midpoints is unchanged under isometries, so if $f: E \to F$ is an isometry and
if $F$ is strictly convex then

$$f\left(\frac{x + y}{2}\right) = f(m(x, y)) = m(f(x), f(y)) = \frac{f(x) + f(y)}{2}.$$


Thus every isometry $f$ from a Banach space $E$ into a strictly convex Banach space
$F$ satisfies the equation

$$f\left(\frac{x + y}{2}\right) = \frac{f(x) + f(y)}{2} \quad \text{for all } x, y \in E. \tag{5}$$

Now if $f(0) = 0$, this says that $f(x/2) = f(x)/2$ for all $x$. It follows, again from (5),
that $f$ is additive:

$$f(x + y) = f(x) + f(y) \quad \text{for all } x, y \in E. \tag{6}$$

It is clear from this equation that $f(nx) = nf(x)$ for every positive integer $n$. Also,
choosing $y = -x$ in (6) we see that $f(-x) = -f(x)$ for all $x$. Hence, $f(nx) = nf(x)$
for every integer $n$. Now it is easy to see that $f(rx) = rf(x)$ for every rational
number $r$. Since $f$ is continuous, for all real $\alpha$ we have

$$f(\alpha x) = \alpha f(x) \quad \text{for all } x \in E.$$


Thus $f$ is real linear even if the spaces $E$ and $F$ are complex; if they are real then
f is linear. This proves the Mazur-Ulam Theorem in the special case when the 
space F is strictly convex. Note that in this case we did not require that f be 
surjective. 
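
For readers who want the passage from the midpoint equation (5) to additivity (6) spelled out, here is one routine way to fill it in. It is a sketch added for convenience, using only (5) and the normalization $f(0) = 0$; the original text leaves this step to the reader.

```latex
\begin{align*}
f\!\left(\tfrac{x}{2}\right)
  &= f\!\left(\tfrac{x + 0}{2}\right)
   = \tfrac{f(x) + f(0)}{2}
   = \tfrac{f(x)}{2}
  \quad\Longrightarrow\quad f(2x) = 2f(x), \\
f(x + y)
  &= f\!\left(\tfrac{2x + 2y}{2}\right)
   = \tfrac{f(2x) + f(2y)}{2}
   = f(x) + f(y).
\end{align*}
```

The first line applies (5) to the pair $(x, 0)$ and then replaces $x$ by $2x$; the second applies (5) to the pair $(2x, 2y)$.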

When the set $M_0(x, y)$ contains points other than $m(x, y)$, the preceding
argument does not work. However, it is possible to give a metric characterization
of the algebraic midpoint and then use a modified version of the above argument.
Here is an outline of the argument.

Let $x, y$ be any two points of a Banach space $E$. Starting with the set $M_0(x, y)$,
define, inductively, for $n = 1, 2, \ldots$,

$$M_n(x, y) = \left\{ u \in M_{n-1}(x, y) : \|u - v\| \le \frac{d_{n-1}}{2} \text{ for all } v \in M_{n-1}(x, y) \right\}.$$

Here, $d_n = \operatorname{diam} M_n$. Then we have a nested sequence of closed sets $M_0(x, y) \supset
M_1(x, y) \supset M_2(x, y) \supset \cdots$, with $\operatorname{diam} M_n \le d_0/2^n$. It is not difficult to prove that
the point $m(x, y)$ is in $M_n(x, y)$ for all $n$. Hence,

$$\bigcap_{n=0}^{\infty} M_n(x, y) = \{m(x, y)\}. \tag{7}$$

This gives a metric characterization of the algebraic midpoint $m(x, y)$.

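Now note that if $f$ is a surjective isometry from a Banach space $E$ onto a
Banach space $F$, then

$$M_n(f(x), f(y)) = f(M_n(x, y)) \quad \text{for all } n.$$

At this step of the proof we do need to assume that $f$ is surjective. If $f$ were not
surjective, we could have in $F$ two points $f(x)$ and $f(y)$ whose metric midpoint is
outside the range of $f$. So, from (7) we have

$$\left\{ \frac{f(x) + f(y)}{2} \right\} = \bigcap_{n=0}^{\infty} M_n(f(x), f(y)) = \bigcap_{n=0}^{\infty} f(M_n(x, y)).$$

Since $f$ is injective,

$$\bigcap_{n=0}^{\infty} f(M_n(x, y)) = f\Bigl( \bigcap_{n=0}^{\infty} M_n(x, y) \Bigr).$$

Now appealing to (7) again, we have

$$\frac{f(x) + f(y)}{2} = f\left(\frac{x + y}{2}\right).$$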


As before, from this we can conclude that f is real linear. This proves the 
Mazur-Ulam Theorem. 

Let us now give some simple examples to illustrate the necessity of the
surjectivity assumption for general Banach spaces. Let $E = \mathbb{R}$ and let $F = \mathbb{R}^2$ with
the $\ell_\infty$-norm; i.e., if $x = (x_1, x_2)$ then $\|x\| = \max(|x_1|, |x_2|)$. Let $f: E \to F$ be the
map $f(t) = (t, \sin t)$. Since $|\sin t - \sin s| \le |t - s|$ for all $t$ and $s$, it follows that $f$ is
isometric. Clearly $f$ is not linear. To see another example, let $E = \mathbb{R}$ and let
$F = \mathbb{R}^2$ with the $\ell_1$-norm. Define the map $f: E \to F$ as

$$f(t) = \begin{cases} (t, 0) & \text{if } -1 \le t < 1, \\ (-1, t + 1) & \text{if } t < -1, \\ (1, t - 1) & \text{if } t \ge 1. \end{cases}$$

This is the piecewise linear curve illustrated in Figure 2. It is easy to verify that

$$\|f(t) - f(s)\| = |t - s| \quad \text{for all } t, s \in \mathbb{R}.$$

Thus $f$ is isometric, but not linear.
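
Both examples can be tested on a finite sample of points. The sketch below is an illustration added here, not taken from the note; the function names `f_sup` and `f_l1` and the choice of sample grid are assumptions. The $\ell_1$ check uses exact rationals, while the $\ell_\infty$ check involves `math.sin` and so compares floats with a small tolerance.

```python
import math
from fractions import Fraction

def f_sup(t):
    """f(t) = (t, sin t), viewed as a map from R into R^2 with the max norm."""
    return (t, math.sin(t))

def f_l1(t):
    """The piecewise linear curve into R^2 with the l1 norm."""
    if t < -1:
        return (-1, t + 1)
    if t < 1:
        return (t, 0)
    return (1, t - 1)

def sup_dist(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def l1_dist(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

# Exact rational sample points in [-5, 5] for the l1 example.
grid = [Fraction(i, 4) for i in range(-20, 21)]
l1_ok = all(l1_dist(f_l1(t), f_l1(s)) == abs(t - s) for t in grid for s in grid)

# Floating-point sample points for the max-norm example.
float_grid = [i / 4 for i in range(-20, 21)]
sup_ok = all(
    math.isclose(sup_dist(f_sup(t), f_sup(s)), abs(t - s), abs_tol=1e-12)
    for t in float_grid for s in float_grid
)

# Failure of linearity for the l1 curve: f(2) differs from 2 * f(1).
f1 = f_l1(1)
not_linear = f_l1(2) != (2 * f1[0], 2 * f1[1])
```

A passing check on a grid is of course only evidence; the case analysis behind "it is easy to verify" remains the actual proof.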


Figure 2 


Can this phenomenon occur if $\dim E = \dim F$? The answer is no for finite-dimensional
spaces. It was shown by Charzyński [5], [6, p. 143] that if $E$, $F$ are
$n$-dimensional real normed spaces then every isometry $f: E \to F$ satisfying
$f(0) = 0$ is linear. Note that this implies that $f$ is surjective.

Here is a simple proof of this theorem. Obviously, $f$ maps $S_r^E$, the sphere of
radius $r$ centered at the origin of $E$, into the sphere of the same type in $F$. Assume
that there exists $r > 0$ such that $f(S_r^E)$ is a proper subset of $S_r^F$. Take any point
$y \in S_r^F \setminus f(S_r^E)$. Then the restriction of $f$ to $S_r^E$ is an embedding of $S_r^E$ into
$S_r^F \setminus \{y\}$. If we have two different norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on $\mathbb{R}^n$ then every sphere
with a positive radius $r$ with respect to the norm $\|\cdot\|_1$ centered at 0 is homeomorphic
to the unit sphere with respect to $\|\cdot\|_2$ (the homeomorphism can be defined
by $x \mapsto x/\|x\|_2$ for every $x$ with $\|x\|_1 = r$). Hence, the restriction of $f$ to $S_r^E$ can be
considered as an embedding of the standard sphere $S^{n-1}$ into the punctured
sphere $S^{n-1}$, which is homeomorphic to $\mathbb{R}^{n-1}$. It is well known that such embeddings
do not exist. So, $f$ must be surjective, and therefore, by the Mazur-Ulam
theorem, it is linear. It is interesting to note that the name of Ulam is associated
also with the theorem from topology used here. This is the Borsuk-Ulam Theorem;
see [12, p. 170].

There is a more general version of the Mazur-Ulam Theorem that goes beyond 
Banach spaces to locally convex topological vector spaces. See [6, Chapter VII]. 
The idea of the proof is essentially the same, but now the algebraic midpoint is 
characterized in terms of prenorms. 


3. APPROXIMATE ISOMETRIES. We have defined $\varepsilon$-isometries in Section 1
and explained the Hyers-Ulam problem. Since surjectivity of $f$ is a necessary
requirement in the Mazur-Ulam Theorem, it is natural to impose that condition
here too. However, there is a significant difference between the two problems in
this respect. In Section 2 we explained how for a large class of Banach spaces
(including Euclidean spaces) the Mazur-Ulam Theorem can be proved without this
assumption. Hyers and Ulam gave an example of an $\varepsilon$-approximate isometry $f$
from $\mathbb{R}$ into the Euclidean space $\mathbb{R}^2$, with $f(0) = 0$, that cannot be uniformly
approximated by any linear isometry $g: \mathbb{R} \to \mathbb{R}^2$. They defined $f: \mathbb{R} \to \mathbb{R}^2$ by

$$f(t) = \begin{cases} (t, 0) & \text{if } t \le 1, \\ (t, c \log t) & \text{if } t > 1. \end{cases}$$

Then for each $\varepsilon$, we can choose $c$ such that $f$ is an $\varepsilon$-isometry. To see this note
that $\log t$ is a concave function, and hence for $1 \le s < t$,

$$\frac{\log t - \log s}{t - s} \le \frac{\log t}{t - 1}.$$


Since $(\log t)/t \to 0$ as $t \to \infty$, this means that $\|f(t) - f(s)\|$ is asymptotically like
$|t - s|$. More formally, it is an easy exercise to show that $f$ is an $\varepsilon$-isometry
whenever

$$c^{-2} \ge \varepsilon^{-1} \max_{t > 1} \frac{(\log t)^2}{t - 1}.$$

However, the set $\{\|f(t) - g(t)\| : t \in \mathbb{R}\}$ is unbounded for every linear isometry
$g: \mathbb{R} \to \mathbb{R}^2$.
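
The Hyers-Ulam example lends itself to a numerical experiment. The sketch below is an illustration under stated assumptions, not the authors' computation: it picks $c$ so that $c^2 \max_{t>1} (\log t)^2/(t-1) \le \varepsilon$ (a sufficient condition, with the maximum only estimated on a grid), and the names `make_f` and `defect` are hypothetical.

```python
import math

def make_f(c):
    """f(t) = (t, 0) for t <= 1 and (t, c*log t) for t > 1."""
    def f(t):
        return (t, 0.0) if t <= 1 else (t, c * math.log(t))
    return f

def defect(f, s, t):
    """| ||f(t) - f(s)|| - |t - s| | in the Euclidean norm of R^2."""
    (a, b), (p, q) = f(s), f(t)
    return abs(math.hypot(p - a, q - b) - abs(t - s))

eps = 0.1

# Crude grid estimate of max over t > 1 of (log t)^2 / (t - 1).
peak = max(math.log(1 + i / 100) ** 2 / (i / 100) for i in range(1, 100_000))
c = math.sqrt(eps / peak)
f = make_f(c)

# Largest defect over a finite sample of pairs (s, t).
points = [-5 + 0.37 * i for i in range(300)]
worst = max(defect(f, s, t) for s in points for t in points)

# Distance from f(t) to the linear isometry g(t) = (t, 0) is c*log t,
# which grows without bound; here it is evaluated at t = 10^12.
drift = f(1e12)[1]
```

On this sample `worst` stays below `eps`, while `drift` already exceeds 10, which is the qualitative behaviour the example is meant to exhibit.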

After the Hyers-Ulam solution of the problem for Hilbert spaces, there were 
several papers giving partial solutions for special Banach spaces. A breakthrough 
was made by Gruber [8], who proved that if a constant K satisfying (3) can be 
found (for a given pair of real Banach spaces E and F) then this inequality 
remains true if we choose K = 5. Further, he proved that this can always be done 
if E and F are finite-dimensional. In the general case of all real Banach spaces 
this was proved by Gevirtz [7]. Finally, it was shown by Omladič and Šemrl that
the choice $K = 2$ works in (3) for all real Banach spaces $E$ and $F$ [14]. Here
is a simple example that shows the inequality (3) with $K = 2$ is sharp. Define
$f: \mathbb{R} \to \mathbb{R}$ by

$$f(t) = \begin{cases} t - 1 & \text{if } t \notin [0, 1/2], \\ -3t & \text{if } t \in [0, 1/2]. \end{cases}$$

One can easily check that $f$ is a surjective 1-isometry satisfying $f(0) = 0$. The only
linear isometries $g: \mathbb{R} \to \mathbb{R}$ are $g(t) = t$ and $g(t) = -t$. Obviously, the second one
does not approximate $f$ uniformly, while $\max |f(t) - t| = |f(\tfrac{1}{2}) - \tfrac{1}{2}| = 2$.
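
The claims about this example can be confirmed exactly on a rational grid. The following sketch is an added illustration; the grid and the variable names are assumptions, and a finite sample cannot by itself establish the 1-isometry property for all real $t$, $s$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def f(t):
    """f(t) = -3t on [0, 1/2] and t - 1 elsewhere."""
    return -3 * t if 0 <= t <= HALF else t - 1

# Exact rational sample points in [-3, 3] with step 1/16.
grid = [Fraction(i, 16) for i in range(-48, 49)]

# Largest value of | |f(t) - f(s)| - |t - s| | over the sample.
defect = max(abs(abs(f(t) - f(s)) - abs(t - s)) for t in grid for s in grid)

# Distance from f to each of the two linear isometries on the sample.
dist_to_identity = max(abs(f(t) - t) for t in grid)
dist_to_negation = max(abs(f(t) + t) for t in grid)
```

On this grid `defect` equals 1 (attained, for instance, at $s = 1/2$, $t = 1$) and `dist_to_identity` equals 2, attained at $t = 1/2$. The value of `dist_to_negation` grows with the size of the sampling window, consistent with the remark that $g(t) = -t$ does not approximate $f$ uniformly.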


4. EUCLIDEAN SPACES. The Hyers-Ulam example explained in Section 3 can
be modified to show that if $E$ and $F$ are real Hilbert spaces with either
$\dim E < \dim F$, or $\dim E = \dim F = \infty$, then there exists an $\varepsilon$-isometry
$f: E \to F$, $f(0) = 0$, that is not uniformly close to any linear isometry. Of course,
such an $f$ is not surjective.

What happens in the remaining case, $\dim E = \dim F < \infty$? The following
theorem gives the answer.


Theorem 1. Let $E_n$ be an $n$-dimensional Euclidean space and let $f: E_n \to E_n$ be an
$\varepsilon$-isometry satisfying $f(0) = 0$. Then there exists a unique bijective linear isometry
$g: E_n \to E_n$ such that

$$\|f(x) - g(x)\| \le 2\varepsilon$$

for all $x \in E_n$.


Our proof has two steps. First we use two theorems from the Hyers-Ulam paper
to find an isometry $g$ and a constant $K$ (depending on $n$) such that the inequality
(3) is true. Then we use the special inner product structure of $E_n$ to show that $K$
can be replaced by 2. This argument is simpler than the one in [14] for arbitrary
Banach spaces, and requires no assumption of surjectivity on $f$.

The inner product between two vectors $x$ and $y$ will be denoted by $\langle x, y \rangle$. Let
$f: E_n \to E_n$ be an $\varepsilon$-isometry satisfying $f(0) = 0$. Assume for a moment that $f$ can
be uniformly approximated by a linear isometry $g: E_n \to E_n$, that is, there exists a
positive real constant $M$ such that

$$\|f(x) - g(x)\| \le M \quad \text{for all } x \in E_n.$$

Let $m$ be an arbitrary positive integer. Replacing $x$ in this inequality by $2^m x$,
dividing the obtained inequality by $2^m$, and using linearity of $g$ we get

$$\left\| \frac{f(2^m x)}{2^m} - g(x) \right\| \le \frac{M}{2^m} \quad \text{for all } x \in E_n$$

and for all positive integers $m$. This shows that if

$$\lim_{m \to \infty} \frac{f(2^m x)}{2^m} \tag{8}$$

exists, then a linear isometry $g$ can approximate $f$ uniformly if and only if $g(x)$ is
equal to this limit for every $x$. The sequence in (8) is now called the Hyers-Ulam
sequence.
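
To see the Hyers-Ulam sequence at work, one can perturb a known linear isometry and watch the sequence recover it. The sketch below is an illustration built on assumptions that are not in the original: the rotation `g`, the particular bounded perturbation inside `f`, and the constants `EPS` and `THETA` are all hypothetical choices. Since the perturbation has norm at most `EPS/2`, the map `f` is an `EPS`-isometry with `f(0) = 0`.

```python
import math

EPS = 0.5
THETA = 0.7  # rotation angle of the hypothetical linear isometry g

def g(x):
    """A rotation of the plane: a bijective linear isometry."""
    c, s = math.cos(THETA), math.sin(THETA)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def f(x):
    """g plus a perturbation of norm at most EPS/2 that vanishes at 0."""
    gx = g(x)
    bump = 0.5 * EPS * math.sin(math.hypot(x[0], x[1]))
    return (gx[0] + bump, gx[1])

def hyers_ulam_term(x, m):
    """The m-th term f(2^m x) / 2^m of the sequence in (8)."""
    scale = 2.0 ** m
    fx = f((scale * x[0], scale * x[1]))
    return (fx[0] / scale, fx[1] / scale)

x = (1.0, -2.0)
steps = list(range(0, 30, 5))
# Euclidean distance between each term of the sequence and g(x).
errors = [math.dist(hyers_ulam_term(x, m), g(x)) for m in steps]
```

The entries of `errors` are bounded by `(EPS / 2) / 2**m`, matching the estimate $M/2^m$ derived above with $M = \varepsilon/2$ for this particular `f`.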

The first result in the Hyers-Ulam paper [9] states that this sequence does 
converge for every x. 


Lemma 2. Let $E_n$ be an $n$-dimensional Euclidean space. Suppose that $\varepsilon > 0$ and that
$f: E_n \to E_n$ is an $\varepsilon$-isometry satisfying $f(0) = 0$. Then

$$g(x) = \lim_{m \to \infty} \frac{f(2^m x)}{2^m}$$

exists for every $x \in E_n$. The mapping $g$ is a linear bijective isometry.


After this, Hyers and Ulam prove that an $\varepsilon$-isometry (not necessarily surjective)
“approximately preserves” orthogonality, in the following sense:


Lemma 3. Let $f$ and $g$ be as in Lemma 2, and let $u \in E_n$ be a unit vector. Then for
every $x \in E_n$ orthogonal to $u$ we have $|\langle f(x), g(u) \rangle| \le 3\varepsilon$.


Proof of Theorem 1: Let $g: E_n \to E_n$ be as in Lemma 2. Since $g^{-1}$ is an isometry,
$g^{-1} \circ f$ is an $\varepsilon$-isometry. Note that $g^{-1} \circ f$ sends zero to zero and

$$\lim_{m \to \infty} \frac{(g^{-1} \circ f)(2^m x)}{2^m} = x$$

for all $x$. As it is enough to prove the conclusion for $g^{-1} \circ f$, we can assume with
no loss of generality that $g(x) = x$ for every $x$.

First we show, using induction, the existence of a constant $K$ (depending on $n$)
such that $\|f(x) - x\| \le K\varepsilon$ for all $x$. Let $f$ be an $\varepsilon$-isometry on $E_1$. Since $f(0) = 0$
we have $\bigl|\,|f(x)| - |x|\,\bigr| \le \varepsilon$ for all $x$. So, either $|f(x) - x| \le \varepsilon$ or $|f(x) + x| \le \varepsilon$.
For all $x$ outside a large neighborhood of 0, only one of these can be true. It is
now easy to find a constant $K$ such that $|f(x) - x| \le K\varepsilon$ for all $x$.

Assume now that we have already proved the assertion for $(n - 1)$-dimensional
Euclidean spaces. Let $x$ be any vector in $E_n$ and let $u$ be any unit vector
orthogonal to $x$. By Lemma 3 with $g(y) = y$ we have $|\langle f(x), u \rangle| \le 3\varepsilon$. Let $P$ be
the orthoprojector onto $[u]^\perp$. For any $w \in [u]^\perp$ we define $f_1(w) = Pf(w)$. We
claim that $f_1$ is a $7\varepsilon$-isometry on $[u]^\perp$ satisfying $f_1(0) = 0$ and

$$\lim_{m \to \infty} \frac{f_1(2^m w)}{2^m} = w$$

for all $w$. Obviously, $f_1(0) = 0$. Next note that

$$\begin{aligned}
\bigl|\,\|f_1(w) - f_1(w')\| - \|w - w'\|\,\bigr|
&= \bigl|\,\|f(w) - \langle f(w), u \rangle u - f(w') + \langle f(w'), u \rangle u\| - \|w - w'\|\,\bigr| \\
&\le \bigl|\,\|f(w) - f(w')\| - \|w - w'\|\,\bigr| + 6\varepsilon \le 7\varepsilon.
\end{aligned}$$

Finally,

$$\lim_{m \to \infty} \frac{f_1(2^m w)}{2^m} = \lim_{m \to \infty} \frac{Pf(2^m w)}{2^m} = Pw = w.$$

By the induction hypothesis, there exists a positive constant $K_{n-1}$ such that

$$\|f_1(w) - w\| \le 7K_{n-1}\varepsilon$$

for all $w \in [u]^\perp$. It follows that

$$\|f(x) - x\| = \|f_1(x) + \langle f(x), u \rangle u - x\| \le 7K_{n-1}\varepsilon + 3\varepsilon.$$

Since $x$ was an arbitrary vector, the induction step is over.

Now we will show how to replace $K$ by 2. Take any $x \in E_n$ and set $\|f(x) - x\| = a$.
Assume that $a \ne 0$. Denote by $y$ the unit vector satisfying $f(x) - x = ay$.
The vector $x$ can be written as $x = x_0 + by$, $b \in \mathbb{R}$, where $x_0$ and $y$ are orthogonal.
For every positive integer $m$ we have $f(x + my) = x + my + v_m$, where
$\|v_m\| \le K\varepsilon$ because of what we have shown in the first step. Write $v_m = b_m y + u_m$,
$b_m \in \mathbb{R}$, where $u_m$ and $y$ are orthogonal. Consequently, $\|u_m\| \le K\varepsilon$ and $|b_m| \le K\varepsilon$.
Using the fact that $f$ is an $\varepsilon$-isometry with $f(0) = 0$ we have

$$\bigl|\,\|f(x + my)\| - \|x + my\|\,\bigr| \le \varepsilon.$$

This can be rewritten as

$$\bigl|\,\|(m + b + b_m)y + (x_0 + u_m)\| - \|(m + b)y + x_0\|\,\bigr| \le \varepsilon.$$

Since $u_m$ is bounded and $x_0$ and $u_m$ are orthogonal to $y$, this shows that for every
$\mu > 0$ we have $|b_m| \le \varepsilon + \mu$ if $m$ is large enough.
Since $f$ is an $\varepsilon$-isometry, we have

$$m - \varepsilon \le \|f(x + my) - f(x)\| \le m + \varepsilon,$$

or equivalently,

$$m - \varepsilon \le \|(m - a + b_m)y + u_m\| \le m + \varepsilon.$$

For large $m$ this norm can be brought as close to $m - a + b_m$ as we wish. Since for
large $m$ we have $|b_m| \le \varepsilon + \mu$ with $\mu$ being arbitrarily small, this is possible only if
$a \le 2\varepsilon$. ∎
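
The last two sentences of the proof compress a short estimate. One way to make it explicit, offered here as a sketch rather than as the authors' wording, uses only the orthogonality of $u_m$ and $y$, the bound $\|u_m\| \le K\varepsilon$, and the elementary inequality $\sqrt{A^2 + B^2} - A \le B^2/(2A)$ for $A > 0$:

```latex
\begin{align*}
\|(m - a + b_m)y + u_m\|
  &= \sqrt{(m - a + b_m)^2 + \|u_m\|^2}
   \le (m - a + b_m) + \frac{K^2\varepsilon^2}{2(m - a + b_m)}, \\
m - \varepsilon
  &\le m - a + b_m + \frac{K^2\varepsilon^2}{2(m - a + b_m)}
   \le m - a + \varepsilon + \mu + \frac{K^2\varepsilon^2}{2(m - a + b_m)}.
\end{align*}
```

Cancelling $m$ gives $a \le 2\varepsilon + \mu + K^2\varepsilon^2 / \bigl(2(m - a + b_m)\bigr)$ for all large $m$; letting $m \to \infty$ and then $\mu \to 0$ yields $a \le 2\varepsilon$.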


We should remark that in the second part of our proof no reference was made
to the finite dimensionality of the spaces involved. Thus, the factor 10 obtained by
Hyers and Ulam (for the case of surjective $\varepsilon$-isometries between infinite-dimensional
Hilbert spaces) can be reduced to 2 using this argument.

It would be nice to extend Theorem 1 to $\varepsilon$-isometries $f: E \to F$ where $E$ and $F$
are arbitrary $n$-dimensional real normed spaces. In this case we have the following
substitute for Lemma 2: There exists an increasing sequence $(m_k)$ of positive
integers such that

$$g(x) = \lim_{k \to \infty} \frac{f(m_k x)}{m_k} \tag{9}$$

exists for every $x \in E$. The mapping $g$ is a linear bijective isometry. To prove this
we first observe that the definition of $\varepsilon$-isometry implies that the sequence
$(n^{-1} f(nx))$ is bounded for every $x \in E$. We choose a dense subset $\{z_1, z_2, \ldots\}$ in
$E$. Applying the Cantor diagonal procedure we can find an increasing sequence
$(m_k)$ of positive integers such that

$$g(z_p) = \lim_{k \to \infty} \frac{f(m_k z_p)}{m_k}$$
exists for every positive integer $p$. Using the definition of $\varepsilon$-isometry once again we
see that (9) exists for every $x \in E$. Clearly, $g(0) = 0$. To prove that $g$ is an
isometry we replace $x$ and $y$ in (2) by $m_k x$ and $m_k y$, respectively. Dividing the
obtained inequality by $m_k$ and sending $k$ to infinity we conclude that $g$ is an
isometry. We have already proved that $g$ must be linear.


How To Do MONTHLY Problems
With Your Computer


István Nemes, Marko Petkovšek, Herbert S. Wilf,
and Doron Zeilberger


Fortunately, on 20 April 1977, all of this kludgery was rendered obsolete 
when I found a decision procedure for this problem. 
(A discrete analog to the Risch algorithm for indefinite integration.) 


—R. William Gosper, Jr., Indefinite Hypergeometric Sums in MACSYMA 
(1977) 


1. INTRODUCTION. The problem of finding simple evaluations of major classes 
of sums that involve factorials, binomial coefficients, and their q-analogues, has
been completely solved. Sums that have the rather general form specified in 
Section 3 can all be done algorithmically, that is to say, you can do them on your 
own PC. Your computer evaluates the sum as a simple formula, if that’s possible, 
and gives you a proof that you can check, or gives you a proof that your sum 
cannot be “done” in simple closed form, if that is the case. 

We first briefly describe the algorithms and the theory that have achieved this 
goal. Second, to illustrate both the scope of the method and the fact that in some 
interesting cases human intervention still helps, we show how these computer 
methods would have fared in attacking 27 problems that have appeared over the 
years in the Problems section of this MONTHLY. 

It happens (coincidentally, of course) that three of the authors of this article 
(PWZ) have just written a book [8] that describes the theoretical foundations of the 
solution of this problem, and also gives the software by means of which everyone 
can perform these sums sans peine (almost). 


2. THE METHODS. The methods that have achieved the complete solution of 
this class of problems are the following: 


• Sister Celine’s method [1]
• Gosper’s algorithm [3]
• Zeilberger’s algorithm ct (“creative telescoping”) [11]
• Wilf and Zeilberger’s WZ method [9]
• Petkovšek’s algorithm Hyper [6]


Here is a brief description of the scope of each of these algorithms (full
descriptions are in [8]). Computer programs, in Maple or Mathematica versions,
that carry out each of these algorithms are available free at
http://www.cis.upenn.edu/~wilf/AeqB.html.

Sister Celine’s algorithm has been superseded by faster ones, but her work
contains the original ideas on which the later algorithms have built. What it does
can be stated pretty simply: it finds recurrences for hypergeometric summands.
The fundamental theorem of this subject, which we state precisely in Section 3,
holds that every proper hypergeometric summand does indeed satisfy a recurrence
relation. For instance, if, under your summation sign, there lurks

$$F(n, k) = \binom{n}{k}^2,$$

then her method informs you that

$$nF(n, k) - (2n - 1)\bigl(F(n-1, k) + F(n-1, k-1)\bigr) + (n - 1)\bigl(F(n-2, k) - 2F(n-2, k-1) + F(n-2, k-2)\bigr) = 0.$$

Why did you want to know that? Well, if you sum this recurrence over all integer
$k$, you’ll find immediately (try it!) that the sum $f(n) = \sum_k \binom{n}{k}^2$ satisfies $f(n) =
2(2n - 1)f(n - 1)/n$, and so by induction, $f(n) = \binom{2n}{n}$, and you have evaluated
your sum. But, you say, you already knew that the sum of the squares of the
binomial coefficients is $\binom{2n}{n}$? Sure you did, but the same method works on any


sum of factorials and binomial coefficients and powers in the world, provided it’s of 
the form described in Section 3. So it wasn’t finding that one particular sum that 
was the revolutionary event. It was the fact that Sister Celine’s method can find 
recurrences satisfied by any one of a huge class of summands, and, as was realized 
much later, from the recurrence for the summand there comes the recurrence for 
the sum, and from that comes the closed form evaluation of the sum, if it has one. 
We now have algorithms that handle all of those pieces. 
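
The recurrence quoted for $F(n,k) = \binom{n}{k}^2$ is easy to confirm by machine on a finite range. The sketch below only checks the identity and its consequences; it does not implement Sister Celine's method for finding such a recurrence, and the function names are illustrative.

```python
from math import comb

def F(n, k):
    """F(n, k) = C(n, k)^2, taken to be 0 outside 0 <= k <= n."""
    return comb(n, k) ** 2 if 0 <= k <= n else 0

def celine_lhs(n, k):
    """Left side of the recurrence quoted above; it should vanish identically."""
    return (n * F(n, k)
            - (2 * n - 1) * (F(n - 1, k) + F(n - 1, k - 1))
            + (n - 1) * (F(n - 2, k) - 2 * F(n - 2, k - 1) + F(n - 2, k - 2)))

def f(n):
    """The sum of F(n, k) over all k."""
    return sum(F(n, k) for k in range(n + 1))

# The recurrence for the summand, checked for 2 <= n < 40 and a range of k.
recurrence_holds = all(celine_lhs(n, k) == 0
                       for n in range(2, 40) for k in range(-2, 45))

# Summing over k gives n f(n) = 2(2n - 1) f(n - 1).
sum_recurrence_holds = all(n * f(n) == 2 * (2 * n - 1) * f(n - 1)
                           for n in range(1, 40))

# The closed form f(n) = C(2n, n).
closed_form_holds = all(f(n) == comb(2 * n, n) for n in range(40))
```

The three flags correspond to the three stages described in the text: a recurrence for the summand, the induced recurrence for the sum, and the closed form.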

Gosper’s algorithm completely solves the problem of indefinite hypergeometric
summation. Given a summand $F(k)$ that is a hypergeometric term in $k$ (i.e.,
$F(k + 1)/F(k)$ is a rational function of $k$), Gosper’s algorithm finds a hypergeometric
term $G(k)$ such that $F(k) = G(k + 1) - G(k)$, if one exists, or proves that
none exists, if that be the case. Thus it solves the discrete analogue of the
antidifferentiation problem: instead of exhibiting a given integrand as the derivative
of something, thereby enabling integration in finite terms, it exhibits a given
summand as the difference of something, thereby enabling summation in finite
terms. Examples of the operation of this algorithm are in Section 5.
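
A tiny instance shows what the output of such an algorithm looks like. The example below is chosen here for illustration and is not taken from the article: for $F(k) = k \cdot k!$ the hypergeometric term $G(k) = k!$ satisfies $F(k) = G(k+1) - G(k)$. The code merely verifies this antidifference; it does not implement Gosper's algorithm.

```python
from math import factorial

def F(k):
    """Summand F(k) = k * k!, a hypergeometric term in k."""
    return k * factorial(k)

def G(k):
    """Candidate antidifference G(k) = k!, also hypergeometric."""
    return factorial(k)

# F(k) = G(k + 1) - G(k), since (k + 1)! - k! = k * k!.
telescopes = all(F(k) == G(k + 1) - G(k) for k in range(50))

def partial_sum(n):
    """Sum of F(k) for k = 0, ..., n, evaluated by telescoping."""
    return G(n + 1) - G(0)

matches_direct_sum = all(partial_sum(n) == sum(F(k) for k in range(n + 1))
                         for n in range(30))
```

Once $G$ is known, every partial sum collapses to two evaluations of $G$, which is the discrete counterpart of evaluating an integral from an antiderivative.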

Zeilberger’s algorithm ct finds a recurrence for a given hypergeometric summand
$F(n, k)$. To that extent, it solves the same problem that Sister Celine’s
method solves. The form of the recurrence that it finds is different, however, and
that allows an enormous speedup in its operation time. His algorithm finds a
recurrence for $F(n, k)$ in the form

$$\sum_{j=0}^{d} a_j(n) F(n + j, k) = G(n, k + 1) - G(n, k), \tag{1}$$

in which $G/F$ is a rational function (which the output exhibits) and the $a_j(n)$’s are
polynomials in $n$. The power of this result derives from the fact that if we sum both
sides of this recurrence over a certain range of $k$, the sum on the right side
telescopes, and so is easy to handle, and we obtain a recurrence for the sum,
$\sum_k F(n, k)$, that we are trying to deal with. The fundamental theorem guarantees
that such recurrences always exist if $F$ is a proper hypergeometric summand (see
Section 3).
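
For $F(n,k) = \binom{n}{k}^2$ a recurrence of the form (1) with $d = 1$ can be written down and checked directly. The certificate used below, $R(n,k) = -k^2(3n + 3 - 2k)/(n + 1 - k)^2$, is supplied here as an assumption for illustration (it is verified by the code, not derived by it), and the code is a check of the telescoped identity rather than an implementation of algorithm ct.

```python
from math import comb

def binom(n, k):
    """Binomial coefficient, taken to be 0 outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def F(n, k):
    return binom(n, k) ** 2

def G(n, k):
    """G = R*F with R(n, k) = -k^2 (3n + 3 - 2k) / (n + 1 - k)^2,
    written as -(3n + 3 - 2k) * C(n, k-1)^2 to avoid dividing by zero."""
    return -(3 * n + 3 - 2 * k) * binom(n, k - 1) ** 2

def telescoped_identity(n, k):
    """(n + 1) F(n+1, k) - 2(2n + 1) F(n, k) = G(n, k+1) - G(n, k)."""
    lhs = (n + 1) * F(n + 1, k) - 2 * (2 * n + 1) * F(n, k)
    return lhs == G(n, k + 1) - G(n, k)

certificate_ok = all(telescoped_identity(n, k)
                     for n in range(30) for k in range(-3, 35))

def f(n):
    """The sum of F(n, k) over all k."""
    return sum(F(n, k) for k in range(n + 1))

# Summing over k kills the right side, leaving a recurrence for the sum.
sum_recurrence_ok = all((n + 1) * f(n + 1) == 2 * (2 * n + 1) * f(n)
                        for n in range(30))
```

Compared with the second-order recurrence in $n$ produced by Sister Celine's method for the same summand, this one has order 1 in $n$ and no shifts in $k$ on the left side, which is the structural difference the text describes.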

Wilf and Zeilberger’s WZ method is at once a special case and a generalization
of Zeilberger’s method. In order to prove an identity of the type $\sum_k F(n, k) = 1$, it
finds a recurrence of the form

$$F(n + 1, k) - F(n, k) = G(n, k + 1) - G(n, k), \tag{2}$$

where $G/F$ is a rational function called the proof certificate of the identity. This
form is clearly a special case of (1) above. A recurrence of this form does not
always exist. When it does, one gets two benefits: first a very short proof of one’s
summation identity, and second, because of the symmetry of (2) in $F$ and $G$, one
finds a new identity, involving $G$, from the original one, involving $F$.
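
A standard small instance of a WZ pair is the identity $\sum_k \binom{n}{k} 2^{-n} = 1$. The sketch below takes $F(n,k) = \binom{n}{k}/2^n$ and the certificate $R(n,k) = -k/\bigl(2(n - k + 1)\bigr)$ as given (an assumption for illustration, checked by the code rather than computed by it) and verifies equation (2) in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb

def binom(n, k):
    """Binomial coefficient, taken to be 0 outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def F(n, k):
    """F(n, k) = C(n, k) / 2^n, so that the sum over k should equal 1."""
    return Fraction(binom(n, k), 2 ** n)

def G(n, k):
    """WZ mate G = R*F with certificate R(n, k) = -k / (2 (n - k + 1)),
    written as -C(n, k-1) / 2^(n+1) to avoid dividing by zero."""
    return Fraction(-binom(n, k - 1), 2 ** (n + 1))

# Equation (2): F(n+1, k) - F(n, k) = G(n, k+1) - G(n, k).
wz_equation_ok = all(
    F(n + 1, k) - F(n, k) == G(n, k + 1) - G(n, k)
    for n in range(25) for k in range(-3, 30)
)

# Summing (2) over k telescopes the right side to zero, so the sum of
# F(n, k) over k does not depend on n; it equals its value 1 at n = 0.
sums_are_one = all(sum(F(n, k) for k in range(n + 1)) == 1 for n in range(25))
```

The whole proof of the identity is thus reduced to checking one rational-function identity, which is why $R$ is called a proof certificate.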
Petkovšek’s algorithm Hyper finds closed form solutions $f(n)$ to linear difference
equations with polynomial coefficients,

$$\sum_{j=0}^{d} a_j(n) f(n + j) = 0,$$

when such solutions exist, or it proves that they do not exist, when they do not. We
use the phrase “closed form” in the following precise sense: $f(n)$ is said to be of
(hypergeometric) closed form if it is equal to a linear combination of a fixed
number, $r$, say, of hypergeometric terms in $n$. Thus Hyper completes the job of
doing the summation problem because the methods just described, while they are
guaranteed to give you a recurrence for your unknown sum, are not guaranteed to
give you one of minimum order! But Hyper knows how to solve such recurrences
in closed form, or to prove the impossibility of solving the recurrence in closed
form, if that be the case.


3. THE THEORY. Are these algorithms just more tricks, that might or might not 
work, to try on sums? Quite the contrary. In fact, the algorithms are accompanied 
by theorems that precisely describe circumstances under which they are guaranteed 
to work. So these are definitely not of the let’s-see-if-it-works genre. They will 
work if the hypotheses of the relevant theorems are satisfied. 

We are talking about sums of the form $f(n) = \sum_k F(n, k)$. The whole
method rests on the fact that if $F(n, k)$ is a suitable summand then it satisfies a
recurrence relation of a certain form. A summand $F(n, k)$ is suitable (proper
hypergeometric) if it is of the form

$$F(n, k) = P(n, k) \frac{\prod_{i=1}^{I} (a_i n + b_i k + c_i)!}{\prod_{j=1}^{J} (u_j n + v_j k + w_j)!} x^n y^k, \tag{3}$$

in which


• $P(n, k)$ is a polynomial in $n$ and $k$, whose degree is a specific integer, and
• the limits $I$, $J$ on the products are fixed specific nonnegative integers, and
• the quantities $a_i$, $b_i$, $u_j$, $v_j$ are specific integers, and
• the quantities $c_i$, $w_j$, $x$, $y$ may depend on parameters.


Suppose we have a summand of that kind. What can we expect?

Theorem 1. If $F(n, k)$ is proper hypergeometric then there exist a nonnegative integer
$d$, a rational function $R(n, k)$, and polynomials $\{p_j(n)\}_{j=0}^{d}$, independent of $k$, such
that $F(n, k)$ satisfies

$$\sum_{j=0}^{d} p_j(n) F(n + j, k) = G(n, k + 1) - G(n, k)$$

where $G(n, k) = R(n, k)F(n, k)$.


This theorem of Zeilberger, and the creative telescoping algorithm that
carries it out, are used to find recurrences for given sums. The existence part of
the proof follows from an earlier algorithm of Sister Mary Celine Fasenmyer that
was used by her to find recurrences for hypergeometric polynomials.

From the recurrence for the summand one gets a recurrence for the sum. From 
the recurrence for the sum one gets the evaluation of the sum in closed form, if 
possible, or a proof of impossibility. The latter follows from algorithm Hyper, if the
recurrence obtained is of order greater than 1. Just as the problem of finding 
recurrences has a life of its own, aside from its uses in evaluating sums, so 
algorithm Hyper has a life aside from finding out if sums have closed forms. 
Combinatorics is full of enumeration problems that lead to recurrences. With 
Hyper we can now find solutions of these, or else prove that closed forms do not 
exist, for the first time. In this way a large number of combinatorial sequences have 
been proved not to be of closed form, such as those in the following theorem. 


Theorem 2. None of the following famous sequences can be expressed in hypergeometric
closed form:

• the sum of the cubes (also the fourth and fifth powers) of the binomial coefficients
of order n,
• the number of 3 × n Latin rectangles,
• the number of involutions on n letters,
• the derangement numbers,
• the sum of the first n of the binomial coefficients of order pn (p > 2) [7].


For the whole story of this remarkable current of mathematical thought, see [8].


4. THREE RECIPES FOR SUCCESS. Given a sum $S(n) = \sum_k F(n, k)$ with a
proper hypergeometric summand, Zeilberger’s algorithm ct yields a linear recurrence
relation $\mathcal{R}$ with polynomial coefficients, of order $d \ge 0$, satisfied by S(n).
This is very helpful in the following situations that interest us here (and in many
other situations too):


1. To prove that $S(n) = t_n$, where $t_n$ is given in closed form, simply verify that
$t_n$ also satisfies $\mathcal{R}$, and that it agrees with S(n) for d sufficiently large
consecutive values of n.

2. To prove equality of two such sums use algorithm ct on both, and find a
common multiple, $\mathcal{L}$, of the two resulting recurrences. If the order of $\mathcal{L}$ is
m, verify that the two sums agree for m sufficiently large consecutive values
of n.

3. To find a closed form evaluation of S(n), note first that if d = 0, or d = 1
and $\mathcal{R}$ is homogeneous, then such an evaluation is immediate from $\mathcal{R}$.
Otherwise, for various special reasons we might be able to solve $\mathcal{R}$ by
inspection; it might be homogeneous with constant coefficients, for instance.
But if no solutions are immediately apparent, then
(a) If d = 1 and $\mathcal{R}$ is inhomogeneous, then S(n) can be expressed in terms
of an “indefinite” sum which Gosper’s algorithm will put into closed
form provided such a form exists.
(b) Otherwise, use Hyper to find all closed form solutions of $\mathcal{R}$. Homogenize
first if $\mathcal{R}$ is inhomogeneous. If you are lucky and $\mathcal{R}$ is satisfied by
t > 0 linearly independent hypergeometric terms, then:
i. If t = d, any solution of $\mathcal{R}$ can be put into closed form by choosing
an appropriate linear combination of hypergeometric solutions.
ii. If t < d, try to find a linear combination of hypergeometric solutions
that agrees with S(n) for d sufficiently large consecutive
values of n.
iii. If this fails, use hypergeometric solutions to reduce the order of $\mathcal{R}$.
Repeat these steps with the new recurrence.


This procedure is guaranteed to decide whether S(n) has a closed form evalua- 
tion (and to find it when it exists) whenever F(n, k) is proper hypergeometric, and 
the limits of summation are either infinite (recall that summands often have 
compact support) or linear in n. But sometimes, with a little help, it works even 
when F(n,k) is not proper hypergeometric (cf. problems E 3258, 10206, 10223, 
10388 in the next Section). We refer the interested reader to [8, Chapter 8] for 
more details. 

During the reviewing of this paper, one of the readers asked us for an example 
of a problem that we had not been able to do by these methods, even though it 
may have appeared to be a candidate. Of course such a problem would have to 
violate the hypotheses of Theorem 1, while at the same time seeming, at first 
glance anyway, to satisfy them. A good example of such a problem is an identity 
whose truth was conjectured by Borwein and Bradley, and which has recently been 
proved by Almkvist and Granville. It states that 


\[ \frac{5}{2}\sum_{k=1}^{n}\binom{2k}{k}\frac{k^2}{4n^4+k^4}\prod_{j=1}^{k-1}\frac{n^4-j^4}{4n^4+j^4} = \frac{1}{n^2} \qquad (n \ge 1). \]


After factoring the fourth degree polynomials that appear in the summand one
discovers that it has exactly the form (3), except that, for instance, one of the
numbers $a_i$ is $\sqrt{-1}$, which is not a specific integer, so the conditions are not
satisfied.
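Although this identity lies outside the reach of Theorem 1, it is easy to test for small n. The sketch below evaluates the left side in exact rational arithmetic, reading the identity as (5/2) Σ_{k=1}^{n} C(2k, k) k²/(4n⁴ + k⁴) Π_{j=1}^{k-1} (n⁴ - j⁴)/(4n⁴ + j⁴) = 1/n². The function name is hypothetical, and a finite check of this kind is of course no proof.

```python
from fractions import Fraction
from math import comb

def borwein_bradley_lhs(n):
    """Left side of the finite identity, computed exactly for a positive integer n."""
    total = Fraction(0)
    for k in range(1, n + 1):
        product = Fraction(1)
        for j in range(1, k):
            product *= Fraction(n**4 - j**4, 4 * n**4 + j**4)
        total += comb(2 * k, k) * Fraction(k * k, 4 * n**4 + k**4) * product
    return Fraction(5, 2) * total
```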


5. PROBLEMS AND SOLUTIONS. We looked through MONTHLY problems on 
sums and recurrences that have been published since 1978, and selected 27 of the 
kind we’re considering here. We used algorithms ct and Hyper, following the 
recipes given in the previous section. Wherever possible we used Gosper’s algo- 
rithm and the WZ method, which technically are special cases of Zeilberger’s 
algorithm ct corresponding to d=0, and to d=1 with given closed form 
evaluation, respectively (d being the order of the resulting recurrence). Besides 
our own implementations, we used the outstanding implementation of Zeilberger’s 
algorithm in Mathematica by P. Paule and M. Schorn [5], which excels especially 
when the resulting recurrence is not homogeneous. 

Many of the problems were solved completely automatically, while others 
required a little human help. For example, in several sums that involve the floor 
function we humans carried out the replacement 


\[ \sum_k F(n, k, \lfloor k/2 \rfloor) = \sum_k \bigl(F(n, 2k, k) + F(n, 2k+1, k)\bigr) \tag{4} \]


in which, if F(n,k, m) is hypergeometric in n, k, m, the summand on the left is 
not hypergeometric, but the one on the right is. Other examples of human 
intervention include the choice of the “best” recurrence variable when the sum- 
mand depends on more than one parameter, etc. 
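To make the floor-function replacement (4) concrete, here is a minimal sketch with a summand of our own choosing, F(n, k, m) = 2^(n-k) C(n, k) C(k, m); it is an assumption made for illustration, not a summand quoted from the text. Both forms of the sum are computed and compared, and for this particular summand they also agree with C(2n + 1, n).

```python
from math import comb

def F(n, k, m):
    """Illustrative summand, hypergeometric in n, k and m; zero outside its support."""
    if not (0 <= k <= n and 0 <= m <= k):
        return 0
    return 2 ** (n - k) * comb(n, k) * comb(k, m)

def floor_sum(n):
    """Sum over k of F(n, k, floor(k/2)): not hypergeometric in k as written."""
    return sum(F(n, k, k // 2) for k in range(n + 1))

def split_sum(n):
    """The same sum after separating even and odd k, as in transformation (4)."""
    return sum(F(n, 2 * k, k) + F(n, 2 * k + 1, k) for k in range(n // 2 + 1))
```

The split form is the one that can be handed to algorithm ct, since each of its two summands is hypergeometric in n and k.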

A notable exception in the amount of necessary human aid is the double sum in 
Problem E3376, which required the sharp eyes of P. Paule [4] to notice a special 
relationship among the coefficients of the recurrence. In principle, of course, 
multiple sums can be handled by the methods of [10], but it is nice to be able to
“do” a double sum with single-sum methods. 
And now, here are the problems and their solutions! 


Problem 6407. (Proposed in 1982, p. 703; solution in 1984, p. 315) 


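Define $\begin{bmatrix} n \\ k \end{bmatrix}$ by means of the relation

\[ \begin{bmatrix} n \\ k \end{bmatrix}\prod_{j=1}^{k}(q^j - 1) = (q^n - 1)(q^{n-1} - 1)\cdots(q^{n-k+1} - 1), \]

so that $\begin{bmatrix} n \\ k \end{bmatrix}$ is the so-called Gaussian polynomial. Prove the identity

\[ \sum_{k=1}^{n}\frac{q^k}{1-q^k} = \sum_{k=1}^{n}\frac{(-1)^{k-1}q^{k(k+1)/2}}{1-q^k}\begin{bmatrix} n \\ k \end{bmatrix}. \tag{5} \]

Denote the right side of (5) by S(n). The q-version of algorithm ct yields the recurrence
$S(n) - S(n-1) = q^n/(1 - q^n)$, which is also satisfied by the left side of (5). As they agree
for n = 0, the identity is proved.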


Problem E 3021. (Proposed in 1983, p. 645; solution in 1986, p. 652) 
Let 


n 


p(x) = OR) +ayia ay, (6 


k=0 


Express $p_n(x)$ as an explicit function of $1 - x^2$.
We provide a partial solution as follows. By algorithm ct

\[ 4(n+1)x^2 p_n(x) - 2(2n+3)p_{n+1}(x) + (n+2)p_{n+2}(x) = 0. \tag{7} \]

With $p_0(x) = 1$ and $p_1(x) = 2$, we see from this recurrence that $p_n(x)$ is a polynomial in
$x^2$, and hence in $1 - x^2$. By comparing (7) with the three-term recurrence

\[ (n+1)P_n(x) - (2n+3)xP_{n+1}(x) + (n+2)P_{n+2}(x) = 0 \]

satisfied by the Legendre polynomials $P_n(x)$, we find that $p_n(x) = (2x)^n P_n(1/x)$.


Problem E 3022. (Proposed in 1983, p. 645; solution in 1986, p. 736) 
Show that, for any a > 0 and any positive integer N, 
N k Nol k+1 
k-1{ N 
yo ({isg-pe O gas | 
k=1 ( )a k= a 
We are to show that $\sum_k F(N, k) = 1$, where

\[ F(N,k) = \frac{(-1)^{k-1}\,(a^{-1})^{\overline{N}}}{(k-1)!\,(N-k)!\,(k-1+a^{-1})}, \]

and $(x)^{\overline{N}}$ is the rising factorial. The WZ method does this with the rational proof certificate
$(k-1)(a^{-1}+k-1)/(N(k-N-1))$, and a check of the case N = 1.
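The claim that the summand F(N, k) sums to 1 can be spot-checked in exact arithmetic. The sketch below assumes the reading F(N, k) = (-1)^(k-1) (1/a)^(rising N) / ((k-1)! (N-k)! (k-1+1/a)) with k running from 1 to N; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def rising(x, n):
    """Rising factorial x (x + 1) ... (x + n - 1)."""
    result = Fraction(1)
    for i in range(n):
        result *= x + i
    return result

def F(N, k, a):
    """Summand of Problem E 3022 as read above, for 1 <= k <= N and a > 0."""
    b = 1 / Fraction(a)
    sign = -1 if (k - 1) % 2 else 1
    return sign * rising(b, N) / (factorial(k - 1) * factorial(N - k) * (k - 1 + b))

def total(N, a):
    """Sum of F(N, k) over k = 1..N; the WZ proof shows this equals 1."""
    return sum(F(N, k, a) for k in range(1, N + 1))
```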


Problem E 3065. (Proposed in 1984, p. 649; solution in 1987, p. 378) 
Let n = 0 be any integer and let k be any integer such that k =n + 1. Then find a closed 


formula for 
" (=1)' (k\[k-1-j 
Earl n-j . 
Let S(n) be the sum in question. Algorithm ct finds that


S(n) + S(n+1)= (ta) 


n+2 
whence 


S(n) = (-1) [3 ~ »(#) +s) =(-'S oye) 


(J +1 


Gosper’s algorithm gives the final answer 


A(n ta) * (- 0" 


Problem E 3088. (Proposed in 1985, p. 359; solution in 1987, p. 685) 


Show that, for every positive integer n,

\[ \sum_{k=1}^{n}\frac{k \cdot k!}{n^k}\binom{n}{k} = n. \tag{8} \]

Let $t_k$ denote the summand in (8). Gosper’s algorithm finds that $t_k = s_{k+1} - s_k$ where
$s_k = -n t_k/k$. Summing this recurrence on k from 1 to n gives the sum as $s_{n+1} - s_1 = n$.
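The Gosper antidifference can be verified directly. The following sketch assumes the summand of (8) is t_k = k · k! · C(n, k)/n^k and checks both the telescoping relation and the value of the sum; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def t(n, k):
    """Summand of (8), read as k * k! * C(n, k) / n**k."""
    return Fraction(k * factorial(k) * comb(n, k), n ** k)

def s(n, k):
    """Antidifference reported by Gosper's algorithm: s_k = -n * t_k / k."""
    return -n * t(n, k) / k

def telescopes(n):
    """Check t_k == s_{k+1} - s_k for k = 1..n."""
    return all(t(n, k) == s(n, k + 1) - s(n, k) for k in range(1, n + 1))
```

Since s_{n+1} = 0 and s_1 = -n, the sum collapses to n as stated.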


Problem 6519. (Proposed in 1986, p. 403; solution in 1988, p. 156) 
Let 
_ —(atmt+n—2k\[(atn\(b+m 
F(a, bmn) = ¥ | n—k | k (e%) ) 


where m and n are nonnegative integers. Show that F(a, b, m,n) = F(b, a, n, m). 
For F(a, b, m,n) and F(b,a,n,m) we compute recurrence relations with respect to n 
using algorithm ct. As it turns out, both sums satisfy the same recurrence of order 4: 


—3(1+at+n)\(2+at+n)j(3+a+n)S(n) 
+2(2+a+n)\3+at+n)(10+a+b—2m + 4n)S(n + 1) 
(3 +a +n)(53 + 5a + 5b — ab — 26m — 4am — 4bm 

+36n + 2an + 2bn — 8mn + 6n*)S(n + 2) 
—(7+a+b+2n)(1 + 4a+ 4b+ab+ 8m+am 

+bm + an + bn + 2mn)S(n + 3) 
+(4+n)(4+b+n)\(4+a+b+n)S(n + 4) = 

Checking that F(a, b, m, n) = F(b, a, n, m) for 0 ≤ n ≤ 3 therefore completes the proof.


Problem E 3190. (Proposed in 1987, p. 181; solution in 1988, p. 877)
Show that 


(nw -2(/) 
1 (N=r)(N=r-j) 


j 
de = 0 (10) 
forj >0 and N > 2). 

Let $t_r$ denote the summand in (10). Gosper’s algorithm finds that $t_r = s_{r+1} - s_r$ where
$s_r = r(r + j - N)t_r/(j(N - 2r))$. Summing this recurrence on r from 0 to j − 1 gives the


Problem E 3207. (Proposed in 1987, p. 456; solution in 1990, p. 67)


If m is a positive integer, let 


Show that 
F,,-1(*) — F(x) = omy (2 |x" —x)” form> 1. (11) 


Zeilberger’s algorithm ct instantly yields (11). 


Problem E 3258. (Proposed in 1988, p. 259; solution in 1989, p. 651) 


Prove that 
> [FJ J | =(2n+1 n). 
BYP rl 
If we use the transformation (4) here, the sum in question becomes 
| jar " ‘). 
j 2j7+1\ 2) J 
an _ !), the WZ method proves the 
identity with a proof certificate of 4j(j + 1)/((2n + 3X2j —n — 1)). 


If we divide the summand by the claimed right side, 


Problem E 3335. (Proposed in 1989, p. 525; solution in 1990, p. 927) 


Solve the recurrence 
\[ x_0 = a, \quad x_1 = b, \quad x_{n+2} = x_{n+1} + x_n/(n+1) \quad \text{for } n = 0, 1, 2, \ldots \]
both exactly (in terms of familiar functions of n) and asymptotically.
Algorithm Hyper gives one solution, n + 1. Then by reducing the order we find that
$(n+1)\sum_{k=0}^{n}(-1)^k/(k+1)!$ is another. So

\[ x_n = (n+1)\left[a + (b-2a)\sum_{k=0}^{n+1}\frac{(-1)^k}{k!}\right], \]

which is asymptotic to (n + 1)(a + (b − 2a)/e).
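The closed form can be compared with the recurrence term by term. This sketch assumes the reading x_n = (n + 1)[a + (b - 2a) Σ_{k=0}^{n+1} (-1)^k/k!]; the function names and the sample initial values are our own.

```python
from fractions import Fraction
from math import factorial

def by_recurrence(a, b, count):
    """First `count` terms (count >= 2) of x_{n+2} = x_{n+1} + x_n/(n+1), x_0 = a, x_1 = b."""
    x = [Fraction(a), Fraction(b)]
    for n in range(count - 2):
        x.append(x[n + 1] + x[n] / (n + 1))
    return x[:count]

def closed_form(a, b, n):
    """(n + 1) * (a + (b - 2a) * sum_{k=0}^{n+1} (-1)^k / k!)."""
    sigma = sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 2))
    return (n + 1) * (a + (b - 2 * a) * sigma)
```

As n grows the partial sum tends to 1/e, which gives the stated asymptotic behaviour.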


Problem E 3352. (Proposed in 1989, p. 838; solution in 1991, p. 369) 
Show that

\[ \sum_{n=0}^{\infty}\frac{1}{n!\,(n^4+n^2+1)} = \frac{e}{2}. \]

This is equivalent to

\[ \sum_{n=0}^{\infty}\left(\frac{1}{n!\,(n^4+n^2+1)} - \frac{1}{2\,n!}\right) = 0. \tag{12} \]

Let $t_n$ denote the summand in (12). Gosper’s algorithm finds that $t_n = s_{n+1} - s_n$ where
$s_n = n^2/(2\,n!\,(n^2-n+1))$. Summing this recurrence on n from 0 to ∞ gives the sum as
$s_\infty - s_0 = 0$.
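Both the antidifference and the value e/2 can be checked numerically. The sketch assumes the summand of (12) is t_n = 1/(n!(n⁴ + n² + 1)) - 1/(2 n!) with s_n = n²/(2 n!(n² - n + 1)); the function names are hypothetical.

```python
from fractions import Fraction
from math import e, factorial

def t(n):
    """Summand of (12)."""
    return Fraction(1, factorial(n) * (n**4 + n**2 + 1)) - Fraction(1, 2 * factorial(n))

def s(n):
    """Antidifference reported by Gosper's algorithm."""
    return Fraction(n * n, 2 * factorial(n) * (n * n - n + 1))

def partial_sum(terms):
    """Partial sum of 1 / (n! (n^4 + n^2 + 1)) over n = 0..terms-1, as a float."""
    return float(sum(Fraction(1, factorial(n) * (n**4 + n**2 + 1)) for n in range(terms)))
```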
Problem E 3376. (Proposed in 1990, p. 240; solution in 1992, p. 63) 

Prove that

\[ \sum_{i=0}^{N}\sum_{j=0}^{N}\binom{i+j}{i}^{2}\binom{4N-2i-2j}{2N-2i} = (2N+1)\binom{2N}{N}^{2} \]

for any positive integer N.
Let S(N, i) denote the inner sum on the left. Algorithm ct finds the recurrence

\[ \sum_{j=0}^{3} p_j(N,i)\,S(N,i+j) = 0 \tag{13} \]

where
\begin{align*}
p_0(N,i) &= (1+i)(-i+N)(-1-2i+2N),\\
p_1(N,i) &= -18 - 32i - 22i^2 - 6i^3 - 11N + 4iN + 8i^2N - 30N^2 - 20iN^2,\\
p_2(N,i) &= (2+i)(27 + 23i + 6i^2 + 9N - 4iN + 18N^2),\\
p_3(N,i) &= -2(2+i)(3+i)^2.
\end{align*}


Following Paule [4], we notice that $\sum_{j=0}^{3} p_j(N, i-j) = -2(2N+1)^2$ is independent of i.
Summing recurrence (13) on i from −3 to N, and changing the order of summation to take
advantage of this, we obtain

\[ \sum_{i=1}^{3}\sum_{j=0}^{3-i}\bigl(S(N,-i)\,p_j(N,-i-j) + S(N,N+i)\,p_{j+i}(N,N-j)\bigr) - 2(2N+1)^2\sum_{i=0}^{N}S(N,i) = 0. \tag{14} \]

Since for positive integer i the sums S(N, −i) and S(N, N + i) contain only i nonzero
terms, the result

\[ \sum_{i=0}^{N}S(N,i) = (2N+1)\binom{2N}{N}^{2} \]

can be readily computed from (14).
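Paule's observation about the coefficients, and the identity itself, can both be tested by brute force. The sketch assumes the coefficients p_0, ..., p_3 as read above (in particular p_3(N, i) = -2(2 + i)(3 + i)²) and the double sum Σ_{i,j=0}^{N} C(i+j, i)² C(4N-2i-2j, 2N-2i); the function names are hypothetical.

```python
from math import comb

def p(j, N, i):
    """Coefficient p_j(N, i) of recurrence (13), as read from the text."""
    if j == 0:
        return (1 + i) * (N - i) * (2 * N - 1 - 2 * i)
    if j == 1:
        return (-18 - 32 * i - 22 * i**2 - 6 * i**3 - 11 * N + 4 * i * N
                + 8 * i**2 * N - 30 * N**2 - 20 * i * N**2)
    if j == 2:
        return (2 + i) * (27 + 23 * i + 6 * i**2 + 9 * N - 4 * i * N + 18 * N**2)
    return -2 * (2 + i) * (3 + i) ** 2

def shifted_coefficient_sum(N, i):
    """Sum over j of p_j(N, i - j); Paule's remark is that it does not depend on i."""
    return sum(p(j, N, i - j) for j in range(4))

def double_sum(N):
    """Left side of the identity of Problem E 3376."""
    return sum(comb(i + j, i) ** 2 * comb(4 * N - 2 * i - 2 * j, 2 * N - 2 * i)
               for i in range(N + 1) for j in range(N + 1))
```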


Problem E 3439. (Proposed in 1991, p. 437; solution in 1993, p. 188) 
If M and N are nonnegative integers, prove that 
M+N\ _ M-a-1\{Nt+a M-—a\(Nt+a 
a een re re 


0 M-1 
<as 
Sas >) 


If M = 0 both sides are 1. When M > 0 the two sums on the right can be combined into 
a single hypergeometric sum 


M-1 eee eee ee 


SON) = 2b Gas —a) 2a 


a 
for which creative telescoping finds the recurrence 


(N + 1)S(N + 1) − (M + N + 1)S(N) = 0


satisfied by $\binom{M+N}{N}$. As the two sides of (15) agree at N = 0, the identity is proved.


Problem 10206. (Proposed in 1992, p. 266; solution in 1995, p. 657) 


If m and k are positive integers, prove that 


r(.",)(") _ 5 (WAl)(m-e= bil), (16) 


j 
If we apply the transformation (4) to the sum on the right of (16) it becomes 
ee ere jt+1 3] 


Qi+)G+) k -2j 2j (17) 
The creative telescoping algorithm with respect to k shows that the same recurrence

\[ (k - 2m)S(k) + (1 - m + k)S(k+1) + (k+2)S(k+2) = 0 \]

is satisfied both by (17) and by the left side of (16). Since they agree for k = 0, 1, the identity
is proved.


Problem 10223. (Proposed in 1992, p. 462; solution in 1997, p. 70)
For $p \in \mathbb{R}$, q = 1 − p, and positive integers n, prove

\[ \sum_{k=n}^{2n-1}\binom{k-1}{n-1}\left(p^n q^{k-n} + p^{k-n} q^n\right) = 1. \]

Write the sum as $S_n(p) + S_n(1-p)$ where $S_n(p) = \sum_{k=n}^{2n-1}\binom{k-1}{n-1}p^n(1-p)^{k-n}$. By
algorithm ct

\[ S_{n+1}(p) - S_n(p) = \frac{2p-1}{2}\bigl(p(1-p)\bigr)^n\binom{2n}{n}, \]

therefore

\[ S_{n+1}(1-p) - S_n(1-p) = \frac{1-2p}{2}\bigl(p(1-p)\bigr)^n\binom{2n}{n}, \]

which implies that $S_n(p) + S_n(1-p)$ is constant. Evaluating the sum for n = 1 completes
the proof.
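The recurrence for S_n(p) and the resulting identity can be checked at rational values of p. The sketch assumes S_n(p) = Σ_{k=n}^{2n-1} C(k-1, n-1) p^n (1-p)^(k-n); the function names and the sample value of p are our own.

```python
from fractions import Fraction
from math import comb

def S(n, p):
    """S_n(p) as read above, evaluated exactly for a rational p."""
    return sum(comb(k - 1, n - 1) * p**n * (1 - p) ** (k - n) for k in range(n, 2 * n))

def increment(n, p):
    """Right side of the first-order recurrence found by algorithm ct."""
    return (2 * p - 1) / 2 * (p * (1 - p)) ** n * comb(2 * n, n)
```

Replacing p by 1 - p flips the sign of the increment, so the two increments cancel and S_n(p) + S_n(1 - p) keeps its n = 1 value, namely p + (1 - p) = 1.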
Problem 10229. (Proposed in 1992, p. 570; solution in 1994, p. 797) 
Given that m and p are integers with m = p = 1, evaluate 


P 
Elmar i](m} a 


m+] 


Let t; denote the summand in (18). Gosper’s algorithm finds that t; = s,,, — s; where 
= (j — Ij + m\2Qm — 27 + Wt,/(m2m + 1)). Summing this recurrence on j from 1 to 
p gives the sum as s,,; — 5; which is 


1/2 1/2 \p(m—-p+1)(2m + 2p -1) 
m—-pt+i1}]\m+p m(2m + 1) 
Problem 10332. (Proposed in 1993, p. 796; solution in 1996, p. 702)


If n and k are integers with 0 < k <n, prove that 


(ek) > yen alan 


If we divide the summand on the right by the claimed left side, a aE the WZ method 
proves the identity with a proof certificate of 4j(j + k)/(2j +k —n — 1)2n + 1)). 


Problem 10357. (Proposed in 1994, p. 75; solution in 1997, p. 177) 
Define integers $a_{m,n}$ by

\[ \frac{1}{1-u-v+2uv} = \sum_{m,n=0}^{\infty} a_{m,n}\,u^m v^n. \]

Show that $(-1)^j a_{2j,2j+2}$ is the Catalan number $\binom{2j}{j}/(j+1)$.
We expand the function on the left into a power series in u and v, using first the
geometric series, then the Binomial Theorem, and finally the derivatives of the geometric
series:

\begin{align*}
\frac{1}{1-u-v+2uv} &= \frac{1}{1-v}\cdot\frac{1}{1-u\,\frac{1-2v}{1-v}} = \frac{1}{1-v}\sum_{m=0}^{\infty}u^m\left(\frac{1-2v}{1-v}\right)^{m} \\
&= \frac{1}{1-v}\sum_{m=0}^{\infty}u^m\left(1-\frac{v}{1-v}\right)^{m} = \sum_{m,k=0}^{\infty}(-1)^k\binom{m}{k}\frac{u^m v^k}{(1-v)^{k+1}} \\
&= \sum_{m,n,k=0}^{\infty}(-1)^k\binom{m}{k}\binom{n}{k}u^m v^n.
\end{align*}

From this we see that $a_{m,n} = \sum_k (-1)^k\binom{m}{k}\binom{n}{k}$. Using Zeilberger’s algorithm ct on
$S(j) = (-1)^j a_{2j,2j+2} = (-1)^j\sum_k (-1)^k\binom{2j}{k}\binom{2j+2}{k}$ we obtain the recurrence

\[ 16(1+j)(1+2j)(9+4j)S(j) - 2(7+4j)(21+28j+8j^2)S(j+1) + (3+j)(5+2j)(5+4j)S(j+2) = 0, \]

which is satisfied by $c_j = \binom{2j}{j}/(j+1)$. As $S(j) = c_j$ for j = 0, 1, the proof is complete.
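The coefficient formula, the Catalan claim and the second-order recurrence can all be tested with integer arithmetic. The sketch assumes a_{m,n} = Σ_k (-1)^k C(m, k) C(n, k), as obtained from the series expansion; the function names are hypothetical.

```python
from math import comb

def a(m, n):
    """Coefficient of u^m v^n in 1 / (1 - u - v + 2uv)."""
    return sum((-1) ** k * comb(m, k) * comb(n, k) for k in range(min(m, n) + 1))

def S(j):
    """S(j) = (-1)^j * a_{2j, 2j+2}."""
    return (-1) ** j * a(2 * j, 2 * j + 2)

def catalan(j):
    """Catalan number C(2j, j) / (j + 1)."""
    return comb(2 * j, j) // (j + 1)

def recurrence_residual(j):
    """Left side of the recurrence found by algorithm ct; it should vanish."""
    return (16 * (1 + j) * (1 + 2 * j) * (9 + 4 * j) * S(j)
            - 2 * (7 + 4 * j) * (21 + 28 * j + 8 * j * j) * S(j + 1)
            + (3 + j) * (5 + 2 * j) * (5 + 4 * j) * S(j + 2))
```

Multiplying the generating function by its denominator also gives the relation a(m, n) = a(m-1, n) + a(m, n-1) - 2a(m-1, n-1) for m, n ≥ 1, which offers an independent check of the coefficient formula.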


Problem 10363. (Proposed in 1994, p. 175; solution in 1997, p. 179) 


If m, n are integers satisfying 1 < m <n — 1, prove that 


(pro _ (" ‘| _ 2 1 ed | aa 


mol fk +i Poa ky | 
S(k) = . 
(kK) x | k 4(n -m—k-1) 


Let 


Algorithm ct finds the first-order recurrence 
(2m — 2n +k +1)S(k) + (2n —m—k — 2)S(k + 1) = 0, 


for0 <k <m—n — 2, From this, S(k) = C(n, m){ >" =n mn ~ ant k + '), where 


k 
m1 (2n-m—-j-3\  (2n-m-2 
comm = 80) = Sy) =| m-1 ) 
j= 
by Gosper’s algorithm. Finally, 
; ; (em pn *k) 
n-m—~ n-m— 
2n-—m-—2 k (Prom 1) ("7") 
S(k) = _ — ; 
X (k) ( m-1 } x [mans k et) m m 
k 


by Gosper’s algorithm again. 


Problem 10375. (Proposed in 1994, p. 362; solution in 1997, p. 275) 
Find the complete solution of the recurrence 
\[ U_{n+2} = 2(2n+3)^2\,U_{n+1} - 4(n+1)^2(2n+1)(2n+3)\,U_n \quad \text{for } n \ge 0. \]


Petkovšek’s algorithm Hyper finds that (2n)! satisfies the recurrence. Then by reducing
the order we find that $(2n)!\,H_n$ also satisfies it, where $H_n = 1 + 1/2 + \cdots + 1/n$ is the n-th
harmonic number. Hence the complete solution is $U_n = (2n)!\,(C_1 + C_2 H_n)$ where $C_1$ and
$C_2$ are arbitrary constants.
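The general solution can be substituted back into the recurrence. The sketch assumes the recurrence reads U_{n+2} = 2(2n+3)² U_{n+1} - 4(n+1)²(2n+1)(2n+3) U_n, which is the form that (2n)! actually satisfies; the function names and the sample constants are our own.

```python
from fractions import Fraction
from math import factorial

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, with H_0 = 0."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))

def U(n, c1, c2):
    """General solution (2n)! * (C_1 + C_2 * H_n)."""
    return factorial(2 * n) * (c1 + c2 * harmonic(n))

def residual(n, c1, c2):
    """U_{n+2} minus the right side of the recurrence; it should be zero."""
    return (U(n + 2, c1, c2)
            - 2 * (2 * n + 3) ** 2 * U(n + 1, c1, c2)
            + 4 * (n + 1) ** 2 * (2 * n + 1) * (2 * n + 3) * U(n, c1, c2))
```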
Problem 10388. (Proposed in 1994, p. 474; solution in 1997, p. 459)
Find 
n— 3 k 


n -—7z tp 
(r)[ 4 > 2 
where n and p are positive integers. 
The summand is not hypergeometric due to the non-integral coefficient of k. Denote the 
sum by S(n, p), and write S(n, p) = S,(n, p) + S,(n, p) where 


L 


n 
k=0 


n 
—— -k+ 
Si(n, p) = ¥( 3] 4 P 5) 
k 2 


n—3 k 1 

n —— -k-—+ 
S,(n, p) = E (9474) 4 2°? 
k 


2p 
Zeilberger’s algorithm ct finds that both sums satisfy the same recurrence with respect to
p, viz.,

\begin{align*}
&(n-4p-7)(n-4p-5)(n-4p-3)(n-4p-1)\,S_i(n,p) \\
&\quad - 16(n-4p-7)(n-4p-5)(5n+4pn-8p^2-20p-14)\,S_i(n,p+1) \\
&\quad + 512(n-2p-3)(n-2p-4)(2p+3)(p+2)\,S_i(n,p+2) = 0, \quad \text{for } i \in \{1,2\}. \tag{19}
\end{align*}


Then S(n, p) also satisfies (19). Algorithm Hyper finds the complete solution of this 
recurrence as 


n—3 n—1 n— 3 n—1 
| 4 4 4 4 
Pp Pp Pp Pp 
= +E tC 
S(n, p) = Cn) a + C,(ny t= 
(-4)"| 2 2 4P| 2 
Pp Pp Pp 
The initial conditions S(n,0) = 2” and 
hn n-3 k n—3 
S(n,1) = ¥ (7) 4 ~3t! = 92" 
k=0 9) 
can be found using algorithm ct again. From these C,(n) = 0 and C,(n) = 2”, whence 
n—3 n-1 
4 4 
n 
Pp Pp 
S(n, DP) ~~ 4P n—- 1 
2 
Pp 


Problem 10396. (Proposed in 1994, p. 681; solution in 1997, p. 570) 


Let a > 0 and let $(b_n : n \ge 1)$ be defined recursively by $b_1 = a$, $b_2 = 3a$,

\[ b_{n+1} = (2n+1)b_n - (n^2+a^2)b_{n-1} \quad (n \ge 2). \tag{20} \]

Prove that $(b_n)$ contains infinitely many positive and infinitely many negative terms.
Algorithm Hyper finds the complete solution of (20) as
$C_1(1+ai)(2+ai)\cdots(n+ai) + C_2(1-ai)(2-ai)\cdots(n-ai)$
where $i^2 = -1$ and $C_1$, $C_2$ are arbitrary constants. From
the initial conditions, $C_1 = 1/(2i)$ and $C_2 = -1/(2i)$, so that

\[ b_n = \Im\,(1+ai)(2+ai)\cdots(n+ai). \]

Write $b_n = \Im z_n$, where $z_0 = 1$ and $z_n = (n+ai)z_{n-1}$ (for $n \ge 1$). As
$\lim_{n\to\infty}\arctan(a/n) = 0$ but $\sum_{n=1}^{\infty}\arctan(a/n) = \infty$, it is clear that the imaginary part of $z_n$ is positive and
negative infinitely often.
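The description of b_n as the imaginary part of a product is easy to test with exact integer arithmetic when a is an integer. The sketch assumes b_1 = a, b_2 = 3a and recurrence (20) as read above; the function names and the sample value a = 2 are our own.

```python
def b_by_recurrence(a, count):
    """b_1..b_count (count >= 2) from b_{n+1} = (2n+1) b_n - (n^2 + a^2) b_{n-1}."""
    b = {1: a, 2: 3 * a}
    for n in range(2, count):
        b[n + 1] = (2 * n + 1) * b[n] - (n * n + a * a) * b[n - 1]
    return [b[n] for n in range(1, count + 1)]

def b_by_product(a, count):
    """Imaginary parts of z_n = (1 + ai)(2 + ai)...(n + ai) for n = 1..count."""
    re, im = 1, 0  # z_0 = 1
    values = []
    for n in range(1, count + 1):
        re, im = n * re - a * im, n * im + a * re  # multiply by (n + a i)
        values.append(im)
    return values
```

Each factor n + ai rotates z by arctan(a/n), so the argument of z_n grows without bound while the individual steps shrink to zero; this is why the imaginary part keeps changing sign.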


Problem 10403. (Proposed in 1994, p. 792; solution in 1997, p. 368) 
Define a sequence $\langle y_n \rangle$ recursively by $y_0 = 1$, $y_1 = 3$ and

\[ y_{n+1} = (2n+3)y_n - 2n\,y_{n-1} + 8n \tag{21} \]

for $n \ge 1$. Find an asymptotic formula for $y_n$.
Algorithm Hyper finds that $2^n n!$ satisfies the homogeneous part of (21). By reduction of
order we obtain

\[ y_n = 2^n n!\sum_{k=0}^{n}\frac{1+8\sum_{m=1}^{k-1}m}{2^k k!} = 2^n n!\sum_{k=0}^{n}\frac{1+4k(k-1)}{2^k k!} = 2^{n+1}n!\sum_{k=0}^{n}\frac{1}{2^k k!} - 2n - 1, \]

which is asymptotic to $2^{n+1}n!\sqrt{e}$.
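The closed form and the asymptotic constant can be checked against the recurrence. The sketch assumes y_n = 2^(n+1) n! Σ_{k=0}^{n} 1/(2^k k!) - 2n - 1; the function names are hypothetical.

```python
from fractions import Fraction
from math import e, factorial, sqrt

def y_by_recurrence(count):
    """First `count` terms (count >= 2) of recurrence (21) with y_0 = 1, y_1 = 3."""
    y = [1, 3]
    for n in range(1, count - 1):
        y.append((2 * n + 3) * y[n] - 2 * n * y[n - 1] + 8 * n)
    return y[:count]

def y_closed(n):
    """2^(n+1) n! * sum_{k=0}^{n} 1/(2^k k!) - 2n - 1, computed exactly."""
    partial = sum(Fraction(1, 2**k * factorial(k)) for k in range(n + 1))
    return 2 ** (n + 1) * factorial(n) * partial - 2 * n - 1

def ratio_to_asymptote(n):
    """y_n divided by 2^(n+1) n! sqrt(e); it should approach 1."""
    return float(Fraction(y_closed(n)) / (2 ** (n + 1) * factorial(n))) / sqrt(e)
```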


Problem 10424. (Proposed in 1995, p. 70; solution in 1997, p. 466) 


Evaluate the sum 


y ("a"): (22) 
k< 


Denote the sum in (22) by S(n). The creative telescoping algorithm yields the constant-coefficient
recurrence

\[ S(n+3) - 2S(n+2) + S(n+1) - 2S(n) = 0 \quad (n \ge 1). \tag{23} \]

The roots of the characteristic polynomial are 2, ±i, and the solution of (23) satisfying
S(1) = S(2) = 1 is

\[ S(n) = 2^{n-1} - \sin\frac{(n-1)\pi}{2} = 2^{n-1} + \begin{cases} 0, & n \equiv 1 \pmod 2,\\ -1, & n \equiv 2 \pmod 4,\\ 1, & n \equiv 0 \pmod 4. \end{cases} \]
Problem 10466. (Proposed in 1995, p. 654; solution in 1997, p. 575) 
For x € C andn € N, prove the following identities between polynomials: 
nvr (¥t2\|(n-1-x (2") (x tj\( x-j 
—4 |= ; ; 
(a) ( rs j | 2n—] n » 2] 2n — 2j 


(b) Forallm €N, with 0 <m < 2n, generalize (a) to 


n-|—| 
ne x+35 n-1-x)\_ [2n 2 x+] x-J 
(—4) z| j | 2n —j } = (2") Le Leda om 27] 
j= Z! 
(a) Denote the sums on the left and right by S(n) and T(n), respectively. Algorithm ct
finds recurrence relations 


(1 + 2n)S(n) - 21 +n)S(n +1) = Car (Bey 


l+n 2n | (24) 


and 
(1 +n)(2+n)(1 + 3n — 2x)T(n + 2) 
—(1 +n)(6 + 41n + 67n? + 30n° 
—x(28 + 98n + 68n? — 40x — 56nx + 16x*))T(n + 1) 
+2(1 + 2n)(4 + 3n — 2x)(x — n)(2x — 2n — 1)T(n) = 0. 
The latter recurrence turns out to be the homogenization of the former, so S(n) and T(n) 
satisfy the same recurrence of order 2. As they agree for n = 0,1, they are identical. As a 


bonus, from (24) we can express S() in terms of an “indefinite” sum in which the summand 
does not depend on n: 


Cr] 
nal 3k +1-2x 
n 2x 
S = 1-—(2x+1 4k —____________ ; 
(1) = | (2% ) (k + 1)(2k + 5 (33)] 
(b) Denote the sum on the right by U(n, m). Algorithm ct with respect to m finds the 
recurrence 


(2n —m — 1)U(n,m + 2) — 2nU(n,m + 1) + (m+ 1)U(n,m) = 0. (25) 


_ on n—-1 x +] Xx —J . * 
For U(n, 1) = ( ey +1 ., _4- »;) algorithm ct finds the same recurrence as 


for T(n) (for n => 1). As they agree for n = 1, 2, they agree for all n = 1. So U(n, 1) = T(n) 
= U(n,0) (for n > 1). It follows from (25) that U(n, m) = T(n) = S(n). 


Problem 10473. (Proposed in 1995, p. 745; solution in 1997, p. 371) 


Prove that there are infinitely many positive integers m such that 
L 4 (2m4+1 
—— 3* 26 
sam (oe |) 2) 
is an odd integer. 
Denote (26) by S(m). Algorithm ct yields the constant-coefficient recurrence

\[ S(m+2) - 4S(m+1) + S(m) = 0. \tag{27} \]

The sequence T(m) = 5S(m) satisfies (27) as well and starts out as ⟨1, 5, …⟩, hence it is
integral. Let $T_5(m) = T(m) \bmod 5$ and $T_2(m) = T(m) \bmod 2$. Using (27) mod 5 and mod 2,
respectively, we see that $T_5 = \langle 1,0,4,1,0,4,\ldots\rangle$ and $T_2 = \langle 1,1,1,\ldots\rangle$, so that S(3k + 1)
= T(3k + 1)/5 is an odd integer for all $k \ge 0$.
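The residue patterns used in this argument can be generated directly from recurrence (27). The sketch below iterates T(m + 2) = 4T(m + 1) - T(m) from T(0) = 1, T(1) = 5; the function name is hypothetical.

```python
def T(count):
    """First `count` terms of T(m+2) = 4 T(m+1) - T(m) with T(0) = 1, T(1) = 5."""
    terms = [1, 5]
    while len(terms) < count:
        terms.append(4 * terms[-1] - terms[-2])
    return terms[:count]
```

Every third term starting at m = 1 is divisible by 5, and every term is odd, so the quotient T(3k + 1)/5 is an odd integer.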


Problem 10494. (Proposed in 1996, p. 74; solution in 1997, p. 371) 


For each positive integer n, evaluate the sum 


~ k| 4n 2n 
Ee 9'(37)/(72). (78) 
Let t, denote the summand in (28). Gosper’s algorithm finds that t, = 5,,, — 5s, where 


s, = (2k — 1)t,/(2U — 2n)). Summing on k from 0 to 2n — 1 gives the sum as s,,, — Sg + 


6. CONCLUSION. Quite often, MONTHLY problems require evaluation of a single
or double sum in closed form, or a proof of equality of two such sums. When the
summand involves binomial coefficients, factorials, products of rational functions,
and exponential functions with constant base, there are very good chances that
such a problem can be solved automatically by Gosper’s algorithm or by its
generalizations: the WZ method, Zeilberger’s algorithm ct, and algorithm Hyper.
Although Gosper’s algorithm is now over 19 years old (see the quotation on the
title page!), it seems that it is not as widely known as it deserves to be.

To help spread the word, we surveyed MONTHLY problems that have appeared
since the publication of Gosper’s algorithm in 1978. We have presented here a
selection of those on which these methods are successful. For a similar list of
earlier problems, see the Web site
http://www.math.temple.edu/~zeilberg/Monthly.html.
Simultaneously Symmetric Functions


Lawrence W. Baggett, Herbert A. Medina, and Kathy D. Merrill 


1. INTRODUCTION. Symmetry properties of functions are traditional tools used 
to simplify the processes of both graphing and integration. In addition, a knowl- 
edge of the symmetry properties of initial conditions and boundary data provides 
qualitative information about the time evolution of solutions to differential equa- 
tions [3]. The most immediately useful types of symmetry are contained in the 
notions of even and odd functions. We make these notions precise with the 
following definition. 


Definition. A real-valued function w is said to be even on an interval [a, b] if
w(a + x) = w(b − x) for all x ∈ [0, b − a]. The function is odd on [a, b] if
w(a + x) = −w(b − x).


Sometimes we will call a function even or odd if the corresponding equality
holds almost everywhere. Since such precision is not critical to our results, we will 
frequently omit the phrase almost everywhere in this and other contexts. In 
particular, this gives us license to ignore the endpoints of intervals. 

Any function w on [a, b] can be broken down uniquely into the sum of an even
function and an odd function by writing

\[ w(x) = \frac{w(x) + w(a+b-x)}{2} + \frac{w(x) - w(a+b-x)}{2}, \qquad x \in [a,b]. \]

This representation is unique since only the identically 0 function is both even and
odd on [a, b]. Thus we can speak without ambiguity of the even part $w_e$ and the odd
part $w_o$ of a function w on [a, b].
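The decomposition translates directly into code. In this minimal sketch the helper names `even_part` and `odd_part` are our own, and the sample function and interval are arbitrary choices made for illustration.

```python
from fractions import Fraction

def even_part(w, a, b):
    """Even part of w on [a, b]: x -> (w(x) + w(a + b - x)) / 2."""
    return lambda x: (w(x) + w(a + b - x)) / 2

def odd_part(w, a, b):
    """Odd part of w on [a, b]: x -> (w(x) - w(a + b - x)) / 2."""
    return lambda x: (w(x) - w(a + b - x)) / 2

def sample(x):
    """An arbitrary function with no symmetry on [1, 3]."""
    return x**3 + x

a, b = Fraction(1), Fraction(3)
w_e, w_o = even_part(sample, a, b), odd_part(sample, a, b)
```

Reflection about the midpoint sends a + x to b - x, so w_e takes equal values at the two points and w_o takes opposite values, in line with the definition.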

Our ability to break any function on [a, b] into its even and odd parts leads to
breaking real function spaces on [a, b] down into the even functions on [a, b] and
the odd functions on [a, b]. Since any even function is orthogonal to any odd
function under the inner product $\langle v, w\rangle = \int_a^b v(x)w(x)\,dx$, this breaks the real
inner product space $L^2[a, b]$ into a direct sum of orthogonal complements. This
process is implicit in many of the traditional bases used for $L^2[a, b]$. For example,
the Fourier basis for $L^2[-\pi, \pi]$ consists of the (even) cosine functions and the
(odd) sine functions. The simplest polynomial basis for functions on [−a, a],
{1, x, x², …}, is traditionally broken down into the odd powers, which are odd
functions on [−a, a], and the even powers, which are even functions. As a final
example, the Legendre polynomials, which arise from applying the Gram-Schmidt
process to the powers of x restricted to any interval [a, b], also have the property
that the even degree members are even functions and the odd degree members are
odd functions on that interval.

In this paper, we will be concerned with the possibility of simultaneous symmetry
on several intervals. That is, we cut an interval [a, b] into two pieces [a, c] and
[c, b] and ask two questions: First, under what conditions can a function be
simultaneously even (or simultaneously odd) on the whole interval and on the two
pieces? Second, under what conditions can a function be simultaneously even (or
simultaneously odd) on two of these three intervals? Since we can think of $L^2[a, c]$
and $L^2[c, b]$ as being contained in $L^2[a, b]$, we are in effect asking about the
intersection of the three even or three odd subspaces.

One way to picture these questions physically is by imagining standing waves on
a string whose ends are fixed at the points a and b. If the midpoint of the string is
constrained to be a node, the resulting wave pattern is forced to be odd on the
whole string. Now choose two other points a < p < q < b so that the distance
between them is the sum of the distances to their respective ends:
q − p = (p − a) + (b − q). These two points are then the midpoints of two subintervals, which
form a subdivision of our interval. If we constrain the string to force these points to
be nodes as well, we force the wave pattern to be odd on the two subintervals of
which they are the centers. Is this possible no matter how we choose our points p
and q? In this paper we ask first the analogous question about functions that are
allowed to be more complicated than standing waves. We then relax our requirement
to demand only that two of the three midpoints be nodes.

These questions might be mere curiosities except that their answers have strong
connections to the field of ergodic theory, which studies the long-term behavior of
dynamical systems. A mapping T from a measure space X into itself is called
ergodic if it mixes the space X up effectively in the following sense: any measurable
function w: X → ℝ that satisfies w(Tx) = w(x) for almost every x ∈ X (we
say that such a w is invariant under T), must be constant. The answer to our first
question is related to the ergodicity of a natural class of mappings from the
interval [a, b] into itself that come from bending [a, b] into a circle by identifying a
and b, and then defining T to be rotation through a fixed angle. These rotations
are ergodic exactly when the rotation is through an angle irrationally related to
2π; on [a, b], this happens when the rotation is induced by a number irrationally
related to b − a ([4], [5]). In Section 2, we see that nonconstant functions exist that
are simultaneously even (or odd) on all three intervals if and only if the rotation
that takes a to the division point c is not ergodic. Further, when nonconstant
simultaneously symmetric functions do exist, they are precisely the invariant
functions that cause the rotation to be non-ergodic. Thus, simultaneous symmetry
on three intervals gives us a simple geometric way of picturing ergodicity of
rotations.

In Section 3, we see that simultaneous symmetry on two of the three intervals 
has a similar interpretation, which depends on a related functional equation from 
ergodic theory. A real-valued measurable function v is called a coboundary for a
mapping T if there exists a real-valued measurable function w such that v(x) =
w(x) − w(Tx) for almost every x. The difficult problem of determining which
functions are coboundaries is important in the theory of invariant measures and of
generalized eigenvalues. It is also necessary in understanding a key class of
examples in ergodic theory called skew products ([4], [5]). By writing the coboundary
equation in the form w(Tx) = w(x) − v(x), we can think of the w’s that solve it
as being “nearly invariant” under T, with the corresponding v’s being the obstruc- 
tions to actual invariance. We show that when simultaneous symmetry on all three 
intervals is impossible, the obstructions to it are precisely the same as these 
obstructions to invariance, that is, coboundaries. We also find that while simulta- 
neous symmetry on two of the three intervals is always possible, these doubly 
symmetric functions are constrained to be the corresponding “nearly invariant” 
functions w. Thus, simultaneous symmetry on two of the three intervals gives a 
geometric interpretation of solutions to the coboundary equation for irrational 
rotations. 
2. SIMULTANEOUS SYMMETRY ON THREE INTERVALS. Because differences
of scaling and translation are easily dealt with, we focus on the case
[a, b] = [0, 1]. Thus we ask the following question:


Question. Under what conditions on the real number θ, 0 < θ < 1, and on the
function w on [0, 1], can w be simultaneously even (or odd) on the three intervals
[0, θ], [θ, 1], and [0, 1]?


If θ is a rational number p/q, then we can easily construct functions with this
property of simultaneous symmetry. We simply take any symmetric function on
[0, 1/q] and repeatedly translate it by 1/q to cover all of [0, 1]. For example, the
function f(x) = cos 10πx is even on the intervals [0, 1], [0, 2/5], and [2/5, 1]. (See
Figure 1.)


Figure 1. Graph of f(x) = cos 10πx.
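The example can be confirmed numerically by sampling the defining equality w(a + x) = w(b − x) on each interval. In this sketch the helper `is_even_on`, the sample count and the tolerance are our own choices; a sampled check only suggests symmetry and does not prove it. The function g is included as a contrast: it is even on [0, 1] but not on [0, 2/5].

```python
from math import cos, isclose, pi

def is_even_on(w, a, b, samples=400, tol=1e-9):
    """Sampled check that w(a + x) == w(b - x) for x in [0, b - a]."""
    step = (b - a) / samples
    return all(isclose(w(a + i * step), w(b - i * step), abs_tol=tol)
               for i in range(samples + 1))

def f(x):
    """The example from the text, with theta = 2/5."""
    return cos(10 * pi * x)

def g(x):
    """Contrast: period 1, so its symmetry does not pass to [0, 2/5]."""
    return cos(2 * pi * x)

theta = 2 / 5
```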


In fact, such examples are the only functions that are simultaneously symmetric
on [0, 1], [0, θ], and [θ, 1], and they are possible only if θ is rational. This is shown
in the following theorem.


Theorem 1. Let θ ∈ (0, 1) be given and suppose that a function w is simultaneously
even (or simultaneously odd) on the three intervals [0, 1], [0, θ], and [θ, 1]. Then w is
invariant under rotation by θ. In particular, if θ is rational and θ = p/q (in lowest
terms), then w is the periodic extension to [0, 1] of an even (odd) function on [0, 1/q];
if θ is irrational, then w is a constant.


Proof: Extend w periodically from [0, 1) to ℝ, so that w can be regarded as a
function on the circle ℝ/ℤ, with rotation represented by addition mod 1.
If w is even on all three intervals, then w satisfies the three conditions:


w(x) = w(1 − x) for 0 < x < 1, (1)
w(x) = w(θ − x) for 0 < x < θ, (2)
w(θ + x) = w(1 − x) for 0 < x < 1 − θ. (3)


Combining (1) and (2) gives w(x) = w(x + θ − 1) = w(x + θ) for 1 − θ < x < 1.
Combining (1) and (3) gives the same equality for 0 < x < 1 − θ. This establishes
that w is invariant under translation by θ on [0, 1]. Thus, if θ = p/q, we see by
iterating this equation that w is determined by translating its restriction to [0, 1/q].
Combining this with (1) then shows that w is even on [0, 1/q]. On the other hand,
if θ is irrational, ergodicity of translation by θ on [0, 1] (rotation on ℝ/ℤ) ensures
that w must be constant. The same proof, with the insertion of the appropriate
minus signs, establishes the result when w is simultaneously odd on the three
intervals. In the latter case, if θ is irrational, w must be 0. ∎


Theorem 1 establishes the close connection between simultaneous symmetry 
and ergodicity of rotations claimed in the Introduction. Indeed, the proof works by 
showing that the simultaneously symmetric functions are precisely the functions 
that are symmetric on the whole interval and invariant under the rotation that
takes 0 to $\theta$.

For our standing wave example, Theorem 1 shows that if $\theta$ is irrational, we
cannot create standing waves with the three midpoints as nodes. On the other
hand, if $\theta = p/q$, we see that linear combinations of $\sin 2\pi nx$, for $n$'s that are
multiples of $q$, are odd on all three intervals, and linear combinations of $\cos 2\pi nx$
are even on all three. We can get all $L^2$ functions that are simultaneously
symmetric on all three intervals by building Fourier series out of these symmetric
pieces.

In the next section, we relax our requirement to symmetry on two of the three 
intervals. 


3. SIMULTANEOUS SYMMETRY ON TWO INTERVALS. In the preceding section,
we saw that simultaneous symmetry on $[0,1]$, $[0,\theta]$, and $[\theta,1]$ implies a kind
of invariance under rotation that severely restricts the type of functions that are
possible. We will see in this section that simultaneous symmetry on two of these
three intervals still gives a "near invariance," and corresponding restrictions. Our
first use of this is in the following theorem.


Theorem 2. Let $\theta$ be any number, rational or irrational, in $(0,1)$, and suppose $p$ is a
polynomial on $[0,1]$ that is even (odd) on two of the three intervals $[0,1]$, $[0,\theta]$, and
$[\theta,1]$. Then $p$ is a constant polynomial.


Proof: We treat the simultaneously even case; the argument for the odd case is
analogous. The assumption implies that at least two of the three equations (1), (2),
and (3) in the proof of Theorem 1 must hold for $p$. If (1) and (3) hold, it follows as
in that proof that $p(x) = p(x + \theta)$ for an infinite number of $x$'s, and hence for all
complex numbers $x$. If (1) and (2) hold, then (since here we do not extend $p$
periodically, and hence cannot assume that addition in the argument can be taken
mod 1) $p(x) = p(x + \theta - 1)$ for an infinite number of $x$'s. In either case, $p$ has a
nonzero period and hence is constant.

If (2) and (3) hold, then (2) ensures that $p(x) = p(\theta - x)$ for an infinite number
of $x$'s, and hence for all complex numbers $x$. Therefore, $p(-x) = p(\theta + x)$, and
combining this with (3) ensures that $p(-x) = p(1 - x)$. This again leads to the
conclusion that $p$ is constant. ∎


The preceding proof shows that any real-analytic function that is simultaneously 
symmetric on two of the three intervals extends to an entire function with a 
nontrivial period. This periodicity on R is similar to the invariance on R/Z 
required of a function that is simultaneously symmetric on all three intervals. Since 
trigonometric functions with irrational periods are periodic on R but not invariant 
on R/Z, this suggests that they might provide a source of examples of functions 
that are simultaneously symmetric on two of the three intervals. We will see later
that this is indeed the case, but first we develop some helpful machinery. 

There are three possible pairs of intervals to consider, and to distinguish them
more clearly, we adopt a convention that $\theta < 1/2$, so that $[0,\theta]$ is the smaller of
the two subintervals.

The easiest case to picture is the one in which the function $w$ is symmetric on
the two subintervals $[0,\theta]$ and $[\theta,1]$. In this case, it is easy to see that any
symmetric function on $[0,\theta]$ can be pasted together with any symmetric function
on $[\theta,1]$ to give a function on $[0,1]$ of the desired type. Figure 2 illustrates an
example of a function that is even on $[0,\theta]$ and $[\theta,1]$ with $\theta = \sqrt{2}/5 \approx 0.282843$.


Figure 2. Graph of a function that is even on $[0,\theta]$ and $[\theta,1]$ (horizontal axis marked at $\theta/2$, $\theta$, $1/2 + \theta/2$, and $1$).


It requires some work to discover what characteristics a function that is 
symmetric on the two subintervals must have on [0,1]. We will return to this 
question in a moment. First, we show that the other two cases of simultaneous 
symmetry on two of the intervals are just scaled down versions of this one. 

Consider the case of a function that is simultaneously even (odd) on $[0,1]$ and
$[0,\theta]$. Since the function is even (odd) on $[0,1]$, it must also be so on any
subinterval centered at $1/2$, in particular on $[\theta, 1 - \theta]$. Thus, if we restrict our
attention to the interval $[0, 1 - \theta]$, we see again the first type of simultaneous
symmetry on two intervals. That is, we think of $[0,\theta]$ and $[\theta, 1 - \theta]$ as a
subdivision of $[0, 1 - \theta]$, and see that what we are seeking is a function that is even
(odd) on the two pieces of the subdivision. Further, this restriction to $[0, 1 - \theta]$
completely determines the function on $[0,1]$ since symmetry on $[0,1]$ forces the
function on $[1 - \theta, 1]$ to be the mirror image (or minus the mirror image) of the
function on $[0,\theta]$. Figure 3 is the graph of a function that is odd on $[0,\theta]$ and $[0,1]$
with $\theta = \sqrt{5}/10 \approx 0.223607$.


Figure 3. Graph of a function that is odd on $[0,\theta]$ and $[0,1]$.

The final case of simultaneous symmetry on two of the three intervals is that of
functions that are even (odd) on the whole interval $[0,1]$ and on the larger of the
two pieces, $[\theta,1]$. To see that this case again reduces to a scaled down version of
the first case, construct the following picture (for simplicity we consider the
simultaneously even case): Draw an arrow from $0$ to $\theta$ to represent the function's
evolution as $x$ goes from $0$ to $\theta$. Then, using the symmetry on $[0,1]$, draw an arrow
from $1$ back to $1 - \theta$ to indicate the behavior of the function on $[1 - \theta, 1]$. Now
use symmetry on $[\theta,1]$ to draw that same arrow from $\theta$ to $2\theta$. Then use symmetry
on $[0,1]$ to draw it from $1 - \theta$ to $1 - 2\theta$. Repeat this process until no more
multiples of $\theta$ will fit. The function in Figure 4 is an example of such a
construction with $\theta = \pi/15 \approx 0.20944$.


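Figure 4. Graph of a function that is even on $[\theta,1]$ and $[0,1]$ (horizontal axis marked at $1 - 4\theta$, $\theta$, $1 - 3\theta$, $2\theta$, $1/2$, $1 - 2\theta$, $3\theta$, $1 - \theta$, and $4\theta$).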


The resulting picture shows the nearly periodic behavior required of any such 
function. Also, the function must be even on any subinterval where two tails or two 
heads of arrows overlap in opposite directions. These results are contained in the 
following theorem, which we state and prove for even functions; the odd analogue 
is clear. 


Theorem 3. Let $\theta$ be an irrational number in $(0,1)$, and let $q$ denote the greatest
integer in $1/\theta$. A function $w$ is simultaneously even on $[0,1]$ and $[\theta,1]$ if and only if $w$
is the extension via the equation $w(x) = w(x + \theta)$ of a function on $[0,\theta]$ that is
simultaneously even on $[0, 1 - q\theta]$ and $[1 - q\theta, \theta]$.


Proof: As before, we first extend $w$ periodically from $[0,1)$ to $\mathbb{R}$. If $w$ is simultaneously
even on $[0,1]$ and $[\theta,1]$, then we have conditions (1) and (3) as in the proof of
Theorem 1, again yielding that $w(x) = w(x + \theta)$ for $x \in [0, 1 - \theta]$. Iterating this
equation gives $w(x) = w(x + q\theta)$ for $x \in [0, 1 - q\theta]$. Combining this with (1)
gives the evenness of $w$ on $[0, 1 - q\theta]$. To establish the evenness on $[1 - q\theta, \theta]$,
stop the iteration one step sooner, with $w(x) = w(x + (q - 1)\theta)$ for $x \in [0, 1 - (q - 1)\theta]$.
Thus $w(\theta - x) = w(q\theta - x) = w(1 - q\theta + x)$ for $x \in [0, (q + 1)\theta - 1]$.

Conversely, given an $x \in [0,1]$, let $k$ denote the greatest integer in $x/\theta$.
Then $x - k\theta \in [0,\theta]$, and the hypothesis implies that $w(x) = w(x - k\theta)$. If
$x - k\theta \in [0, 1 - q\theta]$, then by symmetry on that interval we have

$$w(x) = w(x - k\theta) = w(1 - q\theta - (x - k\theta)) = w(1 - x + (k - q)\theta) = w(1 - x).$$

Thus also $w(x + \theta) = w(1 - x)$. The proof for the case $x - k\theta \in [1 - q\theta, \theta]$ is
similar. ∎


Since we have seen that all three cases of functions even on two of the three
intervals $[0,1]$, $[0,\theta]$, and $[\theta,1]$ reduce to a scaled-down version of a function even
on the two subintervals, we focus now on that case. We consider the question of
what types of functions are simultaneously even (or simultaneously odd) on $[0,\theta]$
and $[\theta,1]$. One way to think about this question is to ask what the possible
obstructions are to simultaneous symmetry on all three intervals. That is, we ask
what the odd part of a function can be on $[0,1]$ if that function is even on both
$[0,\theta]$ and $[\theta,1]$.

The answer turns out to be related to the coboundary equation described in the
Introduction. In our context of rotation on $[0,1]$, this equation takes the following
form: we say that a measurable function $v$ on $[0,1]$ is a coboundary for the number
$\theta$ if there exists a measurable function $w$ such that $v(x) = w(x) - w(x + \theta)$ for
almost every $x \in [0,1]$; addition in the argument is taken mod 1. The function $w$ is
called a transfer or cobounding function for $v$. Note that for a fixed $v$, any two
transfer functions must differ by an invariant function. Because of ergodicity for an
irrational $\theta$, this implies that the transfer function is unique up to an additive
constant. However, for a rational $\theta$, the transfer functions for a fixed $v$ can come
in many shapes.

We state and prove Theorem 4 for odd functions and note that it has an even 
analogue. 


Theorem 4. Fix $\theta \in [0,1]$. If an odd function $v$ is the odd part on $[0,1]$ of a function
$w$ that is even on both $[0,\theta]$ and $[\theta,1]$, then $v$ is a coboundary for $\theta$ with transfer
function $w/2$. Conversely, if $v$ is a coboundary for $\theta$, then $v$ has a transfer function $w$
that is even on both $[0,\theta]$ and $[\theta,1]$ and whose odd part on $[0,1]$ is $v/2$.


Proof: Suppose first that $v$ is the odd part on $[0,1]$ of a function $w$ that is even on
both $[0,\theta]$ and $[\theta,1]$. Extend $v$ periodically from $[0,1)$ to $\mathbb{R}$. Then (2) and (3) hold
for $w$, as in the proof of Theorem 1, while (1) is replaced by

$$w(1 - x) = w(x) - 2v(x) \quad \text{for } 0 \le x \le 1. \tag{1'}$$

Combining first (1') and (2) and then (1') and (3) as in the proof of Theorem 1, we
deduce that $w(x) - w(x + \theta) = 2v(x)$.

Conversely, suppose now that the odd function $v$ is a coboundary for $\theta$ with
transfer function $w$, so that $w(x) - w(x + \theta) = v(x)$. Then, since $v$ is odd on
$[0,1]$, we have

$$w(x) - w(x + \theta) = -w(1 - x) + w(1 - x + \theta),$$

and thus

$$w(x) - w(1 - x + \theta) = w(x + \theta) - w(1 - x),$$

so that the function $u(x) = w(x) - w(1 - x + \theta)$ is invariant under translation by
$\theta$. Thus $\tilde{w} = w - u/2$ is also a transfer function for $v$. For $x \in [0,\theta]$, $u/2$ is the
odd part of the restriction to $[0,\theta]$ of $w$, implying that the odd part of $\tilde{w}$ restricted
to $[0,\theta]$ is 0. This proves that $\tilde{w}$ is even on $[0,\theta]$. And, for $x \in [0, 1 - \theta]$, the
translate of $u/2$ by $\theta$, which equals $u/2$, is the odd part of the restriction to $[\theta,1]$
of $w$, showing that $\tilde{w}$ is also even there.

We finish the proof by noting that the odd part of $\tilde{w}$ is

$$\frac{\tilde{w}(x) - \tilde{w}(1 - x)}{2} = \frac{\tilde{w}(x) - \tilde{w}(x + \theta)}{2} = \frac{v(x)}{2}. \qquad ∎$$

By showing that the problem of simultaneous symmetry is equivalent to the
problem of finding coboundaries for irrational $\theta$, we have found a simple geometric
description for that difficult problem. We have also acquired for the simultaneous
symmetry problem all the machinery and results developed in the coboundary
context. The following coboundary result is particularly useful [1].


Theorem 5. A function $v \in L^2$ is a coboundary for an irrational $\theta$ with $L^2$ transfer
function if and only if its sequence of Fourier coefficients, $\{\hat{v}(n)\}$, satisfies

$$\sum_{n \ne 0} \left| \frac{\hat{v}(n)}{1 - e^{2\pi i n \theta}} \right|^2 < \infty.$$


Proof: A measurable function $v \in L^2$ is a coboundary with an $L^2$ transfer function
$w$ if and only if the elements of the sequence $\{\hat{w}(n)\}$, which solve the equation
$\sum \hat{v}(n) e^{2\pi i n x} = \sum \hat{w}(n) e^{2\pi i n x} (1 - e^{2\pi i n \theta})$, are the Fourier coefficients of an $L^2$
function; i.e., they are in $\ell^2$. Solving for $\hat{w}(n)$ yields the condition in the statement
of the theorem. ∎
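
For a trigonometric polynomial the sum in Theorem 5 is finite, so the transfer function can be written down coefficient by coefficient. The sketch below illustrates this under our own assumptions: the sample polynomial $v(x) = \sin 2\pi x + \tfrac12 \cos 6\pi x$, the value $\theta = \sqrt{2}/5$, and the helper names `transfer_coefficients` and `evaluate` are illustrative and do not come from the text.

```python
import cmath
import math

def transfer_coefficients(v_hat, theta):
    """Divide each Fourier coefficient v_hat[n] (n != 0) by 1 - e^{2 pi i n theta}.

    The index n = 0 must be absent: the divisor vanishes there.
    """
    return {n: c / (1 - cmath.exp(2j * math.pi * n * theta))
            for n, c in v_hat.items()}

def evaluate(coeffs, x):
    """Evaluate the trigonometric polynomial sum_n coeffs[n] e^{2 pi i n x}."""
    return sum(c * cmath.exp(2j * math.pi * n * x) for n, c in coeffs.items())

theta = math.sqrt(2) / 5  # an irrational rotation number (illustrative choice)

# v(x) = sin(2 pi x) + 0.5 cos(6 pi x), written via its Fourier coefficients.
v_hat = {1: -0.5j, -1: 0.5j, 3: 0.25, -3: 0.25}
w_hat = transfer_coefficients(v_hat, theta)

# Check the coboundary equation v(x) = w(x) - w(x + theta) on a grid.
max_error = max(
    abs(evaluate(w_hat, x) - evaluate(w_hat, x + theta) - evaluate(v_hat, x))
    for x in (k / 50 for k in range(50))
)
print(max_error)
```

For infinite series the same division applies term by term, but the small divisors $1 - e^{2\pi i n\theta}$ then decide whether the resulting coefficients remain square-summable.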


In particular, we can apply Theorem 5 to see that trigonometric polynomials are
all coboundaries. For example, $\sin 2\pi x$ is an odd function that is a coboundary, so
by Theorem 4 it is the odd part of a function $w$ that is even on $[0,\theta]$ and $[\theta,1]$.
Further, we know that the function $w$ is twice the transfer function of $\sin 2\pi x$, so
we can solve for $w$ using Fourier coefficients as in Theorem 5. We find that

$$w(x) = \sin 2\pi x + \frac{\sin 2\pi\theta}{1 - \cos 2\pi\theta} \cos 2\pi x$$

is even on $[0,\theta]$ and $[\theta,1]$. For example, if $\theta = \sqrt{3}/4 \approx 0.433013$, Figure 5 shows
the graphs of $v(x) = 2\sin 2\pi x = w(x) - w(x + \theta)$ and $w$.
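
The closed form above can be sanity-checked numerically. The following sketch, with grid size and tolerance chosen by us for illustration, tests the coboundary relation and the evenness of $w$ on each subinterval for $\theta = \sqrt{3}/4$.

```python
import math

theta = math.sqrt(3) / 4
c = math.sin(2 * math.pi * theta) / (1 - math.cos(2 * math.pi * theta))

def w(x):
    return math.sin(2 * math.pi * x) + c * math.cos(2 * math.pi * x)

def v(x):
    return 2 * math.sin(2 * math.pi * x)

grid = [k / 400 for k in range(401)]  # parameter t in [0, 1]

# Coboundary relation: w(x) - w(x + theta) = 2 sin(2 pi x).
err_coboundary = max(abs(w(x) - w(x + theta) - v(x)) for x in grid)

# Evenness on [0, theta]: w(theta * t) = w(theta * (1 - t)).
err_left = max(abs(w(theta * t) - w(theta * (1 - t))) for t in grid)

# Evenness on [theta, 1]: w(theta + s) = w(1 - s) with s = (1 - theta) * t.
err_right = max(
    abs(w(theta + (1 - theta) * t) - w(1 - (1 - theta) * t)) for t in grid
)
print(err_coboundary, err_left, err_right)
```

The checks succeed because $w(x) = \cos\bigl(2\pi(x - \theta/2)\bigr)/\sin \pi\theta$, a cosine centered at the midpoint $\theta/2$ of the first subinterval and hence also symmetric about $\theta/2 + 1/2$, the midpoint of the second.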

A similar example can easily be built with $\cos 2\pi x$ to give a simple function that
consists of standing waves constrained to have nodes at the midpoints of each of
the two subintervals resulting from an irrational subdivision. Theorem 5 shows that
we can build more complicated functions with this property by using finite linear
combinations of $\cos 2\pi kx$. Infinite combinations are more problematic, with a
tradeoff between how smooth the resulting $v$ is and how well $\theta$ can be approximated
by rationals (e.g., see [2]).

Figure 5. Graphs of $v(x) = 2\sin 2\pi x$ (dashed) and $w(x) = \sin 2\pi x + \dfrac{\sin 2\pi\theta}{1 - \cos 2\pi\theta}\cos 2\pi x$.
From the MONTHLY Fifty Years Ago...

Computing machines, by Myron Tribus, Lecturer in Engineering, University
of California at Los Angeles, introduced by Professor Clifford Bell.

... suggested that the machines of the future may be capable of as many as
20,000 to 40,000 computations per second. (p. 511)

"It is truly amazing what powers the vacuum tube gives to mankind... In
the next few years it will probably supply jobs to millions of men in
completely new industries, and will in thousands of ways increase our
safety, comfort, and material wealth, and happiness." (p. 436)

The Harvard Sequence Controlled Calculator, built by the International
Business Machine Corp. under the guidance of Prof. Howard A. Aiken, is
a major engineering project. It is an assemblage of calculating and control
elements, mounted on racks 8 feet high and totalling 63 feet in length, and
weighing about 5 tons. A 4-horsepower motor furnishes its mechanical
power through a network of shafts and gears. (p. 57)

MONTHLY 54 (1947)




Prime-Producing Quadratics 


R. A. Mollin 


Dedicated to the memory of Daniel Shanks 


1. INTRODUCTION. From the recreational mathematician to the research math- 
ematician, prime producing quadratic polynomials have held a longstanding fasci- 
nation. These polynomials have been ubiquitous in the literature for centuries, but 
quite often they appear merely as curiosities, or with explanations that are 
incomplete. This article is intended to explain the reasons behind this prime 
production to anyone from the uninitiated reader to the expert. The reasons are 
given in terms of class group structures of quadratic fields, which the reader is 
brought to understand via a development from the basics. We take the reader from 
the fundamental idea of a quadratic field, through the arithmetic of ideals therein. 
Furthermore, complete lists are given of quadratic polynomials, having negative 
discriminant, that generate consecutive, distinct primes for an initial range, and 
these lists are shown to be complete under the assumption of a suitable Riemann 
hypothesis. Attendant topics are also discussed in detail, such as the “density” of 
primes produced by such polynomials and the current record holder in that regard, 
as well as material buried at various depths throughout the literature. 


2. COMPLEX PRIME-PRODUCERS. The most celebrated of the quadratic
prime-producing polynomials is $x^2 - x + 41$, discovered by Euler in 1772 [9]. This
polynomial is prime for integers $x = 1, 2, 3, \ldots, 40$. Similarly, in 1798, Legendre
[15] observed that the polynomial

$$f(x) = x^2 + x + 41$$

is prime for all integers $x = 0, 1, \ldots, 39$. It is the latter that has become known as
Euler's polynomial (for example, see [24]). In any case, the prime-producing
capacity of these polynomials has less to do with their specific form than it does
with their discriminant,¹ which is $-163$. We can find numerous polynomials of
discriminant $-163$ that generate consecutive prime values for at least 40 integer
values of $x$ simply by translating the Euler polynomial. For example, we can
produce an infinite family of polynomials that generate forty consecutive, distinct
prime values. Consider the following polynomial, for each $n \in \mathbb{Z}$, obtained from
$f(x)$ via $x \mapsto 3x - 39 - 3n$:

$$g_n(x) = 9x^2 - (18n + 231)x + 9n^2 + 231n + 1523,$$


¹Recall that the discriminant of $f(x) = ax^2 + bx + c$ is $b^2 - 4ac$. This is Lagrange's notion of a
discriminant, which differs from that of Gauss, who considered forms of the type $ax^2 + 2bxy + cy^2$, and
defined the determinant as $b^2 - ac$, which has become known as Gauss's discriminant. See [19,
Appendix E, pp. 347-354] for a detailed explanation of the relationship between forms and ideals,
including clarification of problems and correction of some errors that have crept into the literature over
the years.
which produces forty consecutive, distinct primes for the values:

$$g_n(n + x) = \begin{cases} f(38 - 3x) & \text{for } x = 0, 1, \ldots, 12, \\ f(3x - 39) & \text{for } x = 13, 14, \ldots, 39. \end{cases}$$


These polynomials merely pick the prime values of $f(x)$, spaced three apart,
thereby reflecting the discriminant of $g_n(x)$, namely $-3^2 \cdot 163$. For instance, if
$n = 0$, then $g_0(x) = 9x^2 - 231x + 1523$ is prime for $x = 0, 1, \ldots, 39$ (the special
case discovered by Higgins [14]).
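
The identity between $g_n(n + x)$ and the values of $f$, together with the primality and distinctness of the forty outputs, can be confirmed by direct computation. The following is a minimal sketch; the trial-division helper `is_prime` and the function name `check_family` are our own illustrative choices.

```python
def is_prime(m):
    """Trial division; adequate for the small values arising here."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True

def f(x):
    return x * x + x + 41

def g(n, x):
    return 9 * x * x - (18 * n + 231) * x + 9 * n * n + 231 * n + 1523

def check_family(n):
    """Compare g_n(n + x), x = 0..39, with the stated values of f and test primality."""
    values = [g(n, n + x) for x in range(40)]
    expected = [f(38 - 3 * x) if x <= 12 else f(3 * x - 39) for x in range(40)]
    return (values == expected
            and all(is_prime(value) for value in values)
            and len(set(values)) == 40)

print(all(check_family(n) for n in range(-5, 6)))
```

Since $g_n(n + x) = f(3x - 39)$ does not depend on $n$, every member of the family produces the same forty primes; only the position of the run along the integers changes.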

We can also exhibit an infinite family of polynomials that pick the prime values
of $f$, spaced two apart, and generate forty consecutive, distinct primes. Consider
the polynomial, for all $n \in \mathbb{Z}$, that comes from $f(x)$ via the substitution
$x \mapsto 2x - 39 - 2n$:

$$h_n(x) = 4x^2 - (8n + 154)x + 4n^2 + 154n + 1523.$$


This has discriminant $-2^2 \cdot 163$, and produces forty consecutive, distinct primes
for the values:

$$h_n(n + x) = \begin{cases} f(38 - 2x) & \text{for } x = 0, 1, \ldots, 19, \\ f(2x - 39) & \text{for } x = 20, \ldots, 39. \end{cases}$$

These polynomials just pick the prime values of $f(x)$, spaced two apart. In
particular, if $n = 0$, then $h_0(x) = 4x^2 - 154x + 1523$ produces forty consecutive,
distinct primes: $h_0(0) = 1523 = f(38)$, $h_0(1) = 1373 = f(36)$, ..., $h_0(19) = 41 = f(0)$,
$h_0(20) = 43 = f(1)$, $h_0(21) = 53 = f(3)$, ..., $h_0(39) = 1601 = f(39)$.

Similarly, we can show that there is an infinite family of polynomials that
generate eighty consecutive primes (each repeated twice) from the consecutive
prime values of $f$. Consider the following polynomial, for all $n \in \mathbb{Z}$, obtained from
$f(x)$ via $x \mapsto x - 40 - n$, and having discriminant $-163$:

$$k_n(x) = x^2 - (2n + 79)x + n^2 + 79n + 1601,$$

which is prime at the eighty arguments $n, n + 1, \ldots, n + 79$, since
$k_n(n + x) = k_n(n + 79 - x) = f(39 - x)$ for all $x = 0, 1, 2, \ldots, 39$.
Thus, $k_n$ picks the values of $f(39) = 1601$ to $f(0) = 41$ in descending order, then
repeats the values from 41 to 1601 in ascending order. For instance, if $n = 0$, then
we get the polynomial discovered by Escott in 1899 [8]:

$$k_0(x) = x^2 - 79x + 1601,$$

which produces eighty consecutive primes for the values $x = 0, 1, 2, \ldots, 79$. Also,
if we let $n = 1460$, then we get the polynomial found by Miot in 1912 (see
[7, p. 421]):

$$k_{1460}(x) = x^2 - 2999x + 2248541,$$

which is prime for the eighty consecutive values $x = 1460, 1461, \ldots, 1539$.
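
A short computation makes the repetition explicit: each run of eighty prime outputs contains only forty distinct primes. This is an illustrative sketch; `is_prime` and `run` are helper names of our own.

```python
def is_prime(m):
    """Trial division; adequate for values up to a few thousand."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True

def k(n, x):
    return x * x - (2 * n + 79) * x + n * n + 79 * n + 1601

def run(n):
    """Values of k_n at the eighty consecutive arguments n, n + 1, ..., n + 79."""
    return [k(n, n + x) for x in range(80)]

escott = run(0)     # x^2 - 79x + 1601 at x = 0..79
miot = run(1460)    # k_1460 at x = 1460..1539

print(all(is_prime(value) for value in escott), len(set(escott)))
print(escott == miot)
```

The two runs coincide value for value, which is the sense in which the family $k_n$ only relocates Euler's forty primes rather than producing new ones.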

However, in each of these cases, the number of consecutive, distinct primes, 
forty of them, remains the same, and this is our major concern here, not the actual 
output primes themselves, which are irrelevant for our purposes. 

With the preceding examples as motivation, we consider repetitions of output 
prime values for a string of input values of a given polynomial to be cheating. By 
this we mean that the real test for prime-production comes from a quadratic 
polynomial’s ability to create a string of distinct, consecutive prime output values, 
irrespective of what those actual prime values happen to be. Therefore, with Euler's
polynomial as the template, we study quadratics $f(x) = ax^2 + bx + c$, with discriminant
$\Delta = b^2 - 4ac < 0$. These polynomials produce distinct primes for a string of
values starting with $x = 0$, which we call an initial string of distinct prime values.
Therefore, we assume that $c > 0$. Also, motivated by $g_n(x)$, $h_n(x)$, and $k_n(x)$, we
now formalize the notion of the maximum number of distinct prime values in an 
initial string produced by a given quadratic. 


Definition 2.1. Consider $F(x) = ax^2 + bx + c$ ($a, b, c \in \mathbb{Z}$), $a \ne 0$, and suppose
$|F(x)|$ is prime for all integers $x = 0, 1, \ldots, \ell - 1$. If $\ell \in \mathbb{N}$ is the smallest value
such that $|F(\ell)|$ is composite, $|F(\ell)| = 1$, or $|F(\ell)| = |F(x)|$ for some $x = 0,
1, \ldots, \ell - 1$, then $F(x)$ is said to have prime-production length $\ell$.


For instance, the prime-production length for the polynomials $k_0(x)$, $g_0(x)$, and
$f(x) = k_{-40}(x)$ is 40.
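
Definition 2.1 translates directly into a short procedure. The sketch below is one possible reading of it; the function name `prime_production_length`, the `limit` safeguard, and the treatment of the value 0 as a stopping point are our assumptions.

```python
def is_prime(m):
    """Trial division; returns False for 0 and 1."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True

def prime_production_length(a, b, c, limit=10_000):
    """Smallest x >= 0 at which |F(x)| is not a new prime, for F(x) = ax^2 + bx + c."""
    seen = set()
    for x in range(limit):
        value = abs(a * x * x + b * x + c)
        # Stop at a composite value, at 0 or 1, or at a repeated prime.
        if not is_prime(value) or value in seen:
            return x
        seen.add(value)
    raise ValueError("no stopping point found below limit")

print(prime_production_length(1, 1, 41))        # f = k_{-40}
print(prime_production_length(1, -79, 1601))    # k_0 (Escott)
print(prime_production_length(9, -231, 1523))   # g_0 (Higgins)
```

For $k_0$ the run stops at $x = 40$ because the prime 41 repeats, whereas for $f$ and $g_0$ it stops because the value at $x = 40$ is composite.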


Remark 2.1. In this section, we are not concerned about the absolute values in
Definition 2.1 since we deal only with positive-valued polynomials $F_A(x)$. Also,
$F_A(x) = 1$ if and only if $x = 0$ and $A = 1$, namely $\Delta = -3$. However, in the next
section, we will have to consider the possibility that the polynomials may be
negative or 1, since we deal there with some positive discriminants.


From the perspective of prime-production length, it suffices to look at discriminants
$\Delta \equiv 1 \pmod 4$, since the other case is trivial. To see this, assume that
$\Delta < -4$, where $\Delta \equiv 0 \pmod 4$ is the discriminant of $F(x) = ax^2 + bx + c$. If $c$ is
even, then $F(2)$ is even and composite (since $F(0) = c$ must be 2, given that we are
assuming $\ell \ge 1$). If $c$ is odd, then $F(1)$ is even and composite (since $b$ must be
even when $\Delta \equiv 0 \pmod 4$). Hence, prime-production length does not exceed 2
when $\Delta \equiv 0 \pmod 4$, $\Delta < -4$.

We begin with an investigation of monic polynomials. With Euler's polynomial
as the template, we concentrate upon the Eulerian form

$$F_A(x) = x^2 + x + A,$$

where $\Delta = 1 - 4A$. We observe that $F_A(A - 1) = A^2$. Therefore, the prime-production
length for $F_A(x)$ can be at most $A - 1$. In general, we have:


Proposition 2.1. If $\ell \in \mathbb{N}$ is the prime-production length of $F_A(x)$, then $\ell \le A - 1$,
where $\Delta = 1 - 4A$. If $p$ is the least odd prime such that $\Delta \equiv x^2 \pmod p$ for some
$x \in \mathbb{Z}$, then $\ell < p$. Furthermore, if $\Delta \ne -7$, and $\ell \ge (A - 1)/2$, then $A = p$.


Proof: The first statement follows from the discussion preceding the statement of
the proposition. Since $\Delta \equiv x^2 \pmod p$ for some $x \in \mathbb{Z}$, then we may assume that
$0 \le x < p$, without loss of generality. If $x = 0$, then $p \mid \Delta$. Thus, $F_A((p - 1)/2) =
(p^2 - \Delta)/4 \equiv 0 \pmod p$. If $\ell > (p - 1)/2$, then $(p^2 - \Delta)/4 = p$, that is $\Delta =
p^2 - 4p < 0$. Therefore, $\Delta = -3 = -p$, contradicting that $\ell > (p - 1)/2$. Hence,
$\ell \le (p - 1)/2$. If $\ell \ge (A - 1)/2$, and $A \ne 2$, then by the minimality of
$p$, $F_A(0) = A > p$, so $\ell > (p - 1)/2$, a contradiction. We may now assume that
$x > 0$.

By replacing $x$ by $p - x$ if necessary, we may assume that $x < p$ is odd. If
$x = 2n + 1$ for some integer $n \ge 0$, then $\Delta \equiv (2n + 1)^2 \pmod p$. Therefore,
$F_A(n) = n^2 + n + (1 - \Delta)/4 = ((2n + 1)^2 - \Delta)/4 \equiv 0 \pmod p$. Furthermore,
$F_A(p - 1 - n) \equiv F_A(n) \equiv 0 \pmod p$. However, since $0 < x < p$, then $0 \le n <
(p - 1)/2$, so $F_A(n) < F_A(p - 1 - n)$. If $p - 1 - n < \ell$, then $F_A(p - 1 - n) =
p = F_A(n)$, a contradiction. Thus, $p - 1 \ge p - 1 - n \ge \ell$. We have established the
second statement of the proposition.

If $\ell \ge (A - 1)/2$, then $0 \le n < (p - 1)/2 \le (A - 1)/2 \le \ell$ (observing that
$\Delta \equiv 1 \pmod A$, and that we are assuming $\ell \ge 1$, so $A$ is prime). Thus, $F_A(n) = p$.
However, $F_A(n) \ge F_A(0) = A$. Therefore, either $A = 2$ or $A = p$. If $(1 - \Delta)/4 =
A = 2$, then $\Delta = -7$. ∎


With this setup and the Euler polynomial as motivation, we may now state our 
first goal: 


Goal #1: Classify all polynomials of the form $x^2 + x + A$ ($x \in \mathbb{Z}$) that have
prime-production length $\ell = A - 1$.


By Proposition 2.1, we know that, if $\ell = A - 1$, then $A$ is the least odd prime
for which $\Delta < -7$ is a quadratic residue. For example, $A = 41$ is the least such
prime for $\Delta = -163$.
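
The link between $A$ and the least odd prime modulo which $\Delta$ is a square can be observed on small cases. The sketch below is illustrative only; `least_odd_prime_with_square` and `euler_form_length` are hypothetical helper names, and the list of sample values of $A$ is our choice.

```python
def is_prime(m):
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True

def least_odd_prime_with_square(delta):
    """Least odd prime p with delta = x^2 (mod p) for some x (x = 0 allowed)."""
    p = 3
    while True:
        if is_prime(p) and any((x * x - delta) % p == 0 for x in range(p)):
            return p
        p += 2

def euler_form_length(A):
    """Prime-production length of F_A(x) = x^2 + x + A (values increase, so no repeats)."""
    x = 0
    while is_prime(x * x + x + A):
        x += 1
    return x

for A in (3, 5, 11, 17, 41):
    delta = 1 - 4 * A
    print(A, delta, least_odd_prime_with_square(delta), euler_form_length(A))
```

In each of these cases the length equals $A - 1$ and the least odd prime equals $A$, in agreement with Proposition 2.1.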

Before going further, we wish to instill in the reader some further appreciation
of the polynomial $F_A(x) = x^2 + x + A$, and our quest to achieve Goal #1. We may
ask: What is the largest number of consecutive prime values that polynomials of
the form $F_A(x)$ can assume? The following gives evidence that the answer is
surprising: Any number of consecutive values may be assumed. To see this, we
need to understand something called the "prime $k$-tuples conjecture". This is a
generalization of the "twin primes conjecture" which says that $p$ and $p + 2$ are
both prime infinitely often. One might ask: Can we have $p$, $p + 2$, and $p + 4$
simultaneously prime, infinitely often? The answer is clearly "No", since one of
them must be a multiple of 3. A similar argument proves that, in the sequence $p$,
$p + 2$, $p + 6$, $p + 8$, $p + 24$, one of the values is always divisible by 5. Thus, we
must look further for a generalization of the twin primes conjecture, since it is not
so straightforward.

Let $R = \{r_1, \ldots, r_k\}$ with $r_i \in \mathbb{Z}$ for $i = 1, 2, \ldots, k$. If $q$ is a prime such that
$\prod_{i=1}^{k} (n + r_i) \equiv 0 \pmod q$, for each $n \in \mathbb{Z}$ with $1 \le n \le q$, then there cannot exist
infinitely many primes $p$ such that $\{p + r_i\}_{i=1}^{k}$ are all simultaneously prime. If such
a prime $q$ exists, then we call $R$ inadmissible, and otherwise we call $R$ admissible.
Another way of looking at this is that $R$ is admissible if and only if, for all primes
$q$, there exists an integer $a_q$ with $1 \le a_q \le q$, such that $\prod_{i=1}^{k} (a_q + r_i) \not\equiv 0 \pmod q$.
One might say that, if there is no good reason why $p + r_1, p + r_2, \ldots, p + r_k$
cannot all be prime infinitely often, then they should be. More precisely, we have
the following.


Conjecture 2.1 (The Prime $k$-tuples Conjecture). If $R$ is an admissible set, then
there are infinitely many integers $n$ such that $n + r$ is prime for each $r \in R$. (The twin
prime conjecture is the case $R = \{0, 2\}$.)
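
Admissibility is a finite check: the product $\prod (n + r_i)$ vanishes modulo $q$ for every $n$ exactly when the $r_i$ occupy every residue class modulo $q$, which is impossible once $q$ exceeds the number of elements of $R$. The sketch below uses this observation; the function names `primes_up_to` and `is_admissible` are our own.

```python
def primes_up_to(n):
    """Return the primes <= n by a simple sieve."""
    sieve = [True] * (n + 1)
    primes = []
    for p in range(2, n + 1):
        if sieve[p]:
            primes.append(p)
            for multiple in range(p * p, n + 1, p):
                sieve[multiple] = False
    return primes

def is_admissible(R):
    """R is inadmissible iff its elements cover every residue class mod some prime q."""
    R = list(R)
    for q in primes_up_to(len(R)):
        if len({r % q for r in R}) == q:
            return False
    return True

print(is_admissible([0, 2]))            # twin primes
print(is_admissible([0, 2, 4]))         # blocked by q = 3
print(is_admissible([0, 2, 6, 8, 24]))  # blocked by q = 5
```

Only primes $q \le |R|$ need to be examined, so the test terminates quickly even for large sets.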


Remark 2.2. Dickson looked at questions concerning this conjecture as early as
1904 [7, p. 417]. In 1923, a landmark paper of Hardy and Littlewood [12] appeared,
in which they introduced a function based upon admissible sets, which Hensley and
Richards [13] were able to use a half century later. Hensley and Richards showed
that there is a conflict between the prime $k$-tuples conjecture and what we call
Conjecture A: $\pi(x + y) \le \pi(x) + \pi(y)$ where $x, y \in \mathbb{Z}$, $x, y \ge 2$, and $\pi(x)$ denotes
the number of primes not exceeding $x$. At least one of these (most likely
Conjecture A) must be false.


We are now in a position to provide evidence that there can be arbitrarily many
consecutive primes in an initial range that can be assumed by polynomials of the
form $x^2 + x + A$. In other words, the prime-production length of polynomials of
the form $F_A(x)$ is unbounded.


Theorem 2.1.² If Conjecture 2.1 holds, then for any positive integer $B$, there exists a
quadratic polynomial of the form $F_A(x) = x^2 + x + A$, such that $F_A(x)$ is prime for all
integers $x$ with $0 \le x \le B$.


Proof: Let $r_j = j^2 + j$ for $j = 0, 1, 2, \ldots, B$.

Claim. The set $\{r_j\}_{j=0}^{B}$ is admissible.

If $q = 2$, then let $a_2 = 1$. Since each $r_j$ is even, then $\prod_{j=0}^{B} (r_j + 1)$ is odd. For
each odd prime $q$, let $b_q \equiv 1 \pmod 4$ be a quadratic nonresidue modulo $q$ (which
the Chinese remainder theorem allows us to select) and set $a_q = (1 - b_q)/4$. If
$\prod_{j=0}^{B} (r_j + a_q) \equiv 0 \pmod q$, then for some $j$ with $0 \le j \le B$, $r_j + a_q \equiv 0 \pmod q$.
In other words, $r_j \equiv -a_q \pmod q$. Therefore, $(2j + 1)^2 = 4r_j + 1 \equiv 1 - 4a_q = b_q
\pmod q$, a contradiction. This establishes the Claim.

Conjecture 2.1 ensures that there exist arbitrarily large values of $A$ for which
$\{r_j + A\}_{j=0}^{B}$ are primes. For such an $A$, $F_A(x) = x^2 + x + A$ is prime for $x =
0, 1, 2, \ldots, B$. ∎
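
The construction of the witnesses $a_q$ in the Claim can be carried out explicitly for small primes. The following is a minimal sketch under our own conventions: the helper names are hypothetical, $b_q$ is found by stepping through $5, 9, 13, \ldots$ and applying Euler's criterion, and $a_q$ is reduced modulo $q$.

```python
def is_prime(m):
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True

def nonresidue_one_mod_four(q):
    """Smallest b in 5, 9, 13, ... that is a quadratic nonresidue mod the odd prime q."""
    b = 5
    # Euler's criterion: b is a nonresidue iff b^((q-1)/2) = -1 (mod q).
    while pow(b, (q - 1) // 2, q) != q - 1:
        b += 4
    return b

def witnesses(prime_bound):
    """Map each prime q <= prime_bound to the residue a_q used in the Claim."""
    result = {2: 1}
    for q in range(3, prime_bound + 1, 2):
        if is_prime(q):
            b = nonresidue_one_mod_four(q)
            result[q] = ((1 - b) // 4) % q  # exact division since b = 1 (mod 4)
    return result

def claim_holds(B, prime_bound=100):
    """Check that no r_j + a_q, with r_j = j^2 + j and 0 <= j <= B, is divisible by q."""
    return all((j * j + j + a) % q != 0
               for q, a in witnesses(prime_bound).items()
               for j in range(B + 1))

print(claim_holds(50))
```

Because $(2j + 1)^2 \equiv b_q \pmod q$ has no solution, the check succeeds for every $B$, not only for the sample bound used here.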


Theorem 2.1 gives us the (theoretical) hope of finding a quadratic polynomial of
the form $F_A(x) = x^2 + x + A$ that generates not only more primes than Euler's
polynomial, but also as many as we like! Yet on the practical side, no polynomial
of the form $F_A(x)$ with prime-production length more than forty has yet been
found. Recent efforts by Lukes, Patterson, and Williams [18] have shown that, if
such an $F_A(x)$ exists, then $A > 10^{18}$. This shows the dichotomy between theory and
practice in this regard. Furthermore, Proposition 2.1 tells us that the length of
prime-production for $F_A(x)$ is bounded by $A - 1$. In view of Theorem 2.1, the
reader may think that this is a contradiction since Theorem 2.1 seems to tell us
that the prime-production length is unbounded. However, the point is simply that
the $B$ in Theorem 2.1 must be less than $A - 1$. Therefore, although the prime-production
length for a fixed polynomial is bounded by $A - 1$, the number of
primes that can be taken on by such polynomials is unbounded. For instance, the
work in [18] shows that if we want a prime-production length of $B = 41$, then the
value of $A$ is bigger than $10^{18}$. All that Proposition 2.1 says is that $B = 41$ must be
less than $A - 1$. Since $A > 10^{18}$, this is no barrier to finding such a polynomial.


²Theorem 2.1 was communicated to this author in a letter from Andrew Granville dated February 2,
1989. He gave permission for it to be included in Louboutin, Mollin and Williams' work [17], where the
reader will find more details, and motivation arising from a search for prime-producing quadratic
polynomials, when $\Delta > 0$. Also, A. Balog credits Granville with discovering a corollary of his main
result in [1]. This corollary says that (unconditionally) there exist infinitely many polynomials of degree
$k$ having prime values at $2k + 1$ consecutive integers.

We now focus our attention upon finding those $F_A(x)$ that are prime for
$x = 0, 1, \ldots, A - 2$. Questions concerning prime-producing quadratic polynomials
become interesting only if we look at irreducible polynomials. (Recall that an
irreducible polynomial over $\mathbb{Q}$ is a polynomial $f(x)$ for which there are no
factorizations $f(x) = g(x)h(x)$ into polynomials $g(x)$ and $h(x)$ of positive degree
with coefficients in the rational field $\mathbb{Q}$.) Even more, we should also assume that
there is no prime $p$ that divides $F_A(x)$ for all $x \in \mathbb{Z}$ since, for example, $x^2 + x + 4$
is irreducible but always even.³

The criterion that allows us to achieve our Goal #1 is called the Rabinowitsch
criterion, proved in 1913. To understand what the Rabinowitsch criterion says, we
must now introduce the notion of a class number. To do this, we must first
understand what a ring of integers is, since the class number is a measure of
unique factorization therein. Independent of polynomials, we may define a discriminant
as follows. If $D \ne 1$ is a square-free integer, and

$$\Delta = \begin{cases} 4D & \text{if } D \not\equiv 1 \pmod 4, \\ D & \text{otherwise,} \end{cases}$$

then $\Delta$ is called a fundamental discriminant or field discriminant, since we may
form $K = \mathbb{Q}(\sqrt{D})$ from the adjunction of a root of the irreducible polynomial
$x^2 - D$ to $\mathbb{Q}$. This $K$ is called a quadratic field with discriminant $\Delta$ and associated
radicand $D$; $K$ is called a real quadratic field when $\Delta > 0$, and $K$ is called a
complex quadratic field when $\Delta < 0$. A complex number is an algebraic integer if it
is the root of a monic polynomial with coefficients in $\mathbb{Z}$ (which we call the rational
integers to distinguish them from higher order algebraic integers). If $f$ is a monic
polynomial over $\mathbb{Z}$ of least degree having an algebraic integer $\alpha$ as a root, then $f$ is
irreducible over $\mathbb{Q}$. This unique polynomial is called the minimal polynomial of $\alpha$
over $\mathbb{Q}$, and so the previous statement is equivalent to saying that the minimal
polynomial of $\alpha$ over $\mathbb{Q}$ has coefficients in $\mathbb{Z}$. Moreover, the set of all algebraic
integers in the complex field $\mathbb{C}$ is a ring, which we denote by $\mathbb{A}$. Finally,
$\mathbb{A} \cap K = \mathcal{O}_\Delta$ defines the ring of integers of the quadratic field $K$ of discriminant $\Delta$.

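Now we introduce the basic notion of an ideal. A subset $I$ of $\mathcal{O}_\Delta$ is an ideal
therein if $I$ is closed under addition and also under multiplication by elements of $\mathcal{O}_\Delta$. In other
words, $\alpha + \beta \in I$ whenever $\alpha, \beta \in I$, and $\alpha\gamma \in I$ whenever $\alpha \in I$ and $\gamma \in \mathcal{O}_\Delta$.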

We say that two $\mathcal{O}_\Delta$-ideals $I$ and $J$ are equivalent if there exist nonzero
$\alpha, \beta \in \mathcal{O}_\Delta$ such that $(\alpha)I = (\beta)J$. This equivalence relation partitions the $\mathcal{O}_\Delta$-ideals
$I$ into disjoint ideal classes $\{I\}$, which form a finite abelian group $C_\Delta$, called the
class group of $\mathcal{O}_\Delta$ (or simply of $K$). The order (cardinality) of $C_\Delta$, denoted $h_\Delta$, is
called the class number of $K$.

Now we are ready to state the Rabinowitsch criterion, which tells us when
$h_\Delta = 1$ for $\Delta < 0$. In other words, it tells us how the prime-producing capacity of
$F_\Delta(x)$ for $\Delta < 0$ is intimately linked to the solution of Gauss’s class number one
problem for complex quadratic fields. We use $\lfloor x \rfloor$ to denote the greatest integer
less than or equal to $x$, namely the “floor” of $x$.


³In [4], Bouniakowsky conjectured that, if a polynomial $p(x) \in \mathbb{Z}[x]$ is irreducible and $N =
\gcd(\{p(x) : x \in \mathbb{Z}\})$, then $p(x)/N$ takes on prime values for infinitely many $x \in \mathbb{Z}$.

Theorem 2.2 (Rabinowitsch’s Criterion [23]).⁴ Let $\Delta < 0$ be a discriminant with
$\Delta \equiv 1 \pmod 4$. Then $F_\Delta(x) = x^2 + x + (1 - \Delta)/4$ is prime for all $x \in \mathbb{Z}$ with
$0 \le x \le \lfloor |\Delta|/4 - 1 \rfloor$ if and only if $h_\Delta = 1$. (We call $\lfloor |\Delta|/4 - 1 \rfloor$ the Rabinowitsch
bound.)


We see that Euler’s polynomial $k_{-40}(x) = x^2 + x + 41$ fits quite nicely into the
criterion. In point of fact, it is the last one to do so, as the following solution of the
class number one problem for complex quadratic fields shows. This was solved
independently by Baker and Stark, anticipated by Heegner (see [19, Chapter 4,
pp. 105-128]).


Theorem 2.3 (Gauss’s Class Number One Problem for $\Delta < 0$). $h_\Delta = 1$ if and only
if $-\Delta \in \{3, 4, 7, 8, 11, 19, 43, 67, 163\}$.


Proof: See [6, Theorem 7.30, p. 149]. □
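
As an informal sanity check on Theorems 2.2 and 2.3, the following sketch tests the Rabinowitsch condition by brute force. The helper names `is_prime` and `rabinowitsch_holds` are illustrative and do not come from the paper, and trial division is used only because the numbers involved are tiny.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def rabinowitsch_holds(delta: int) -> bool:
    """Check that x^2 + x + (1 - delta)/4 is prime for 0 <= x <= floor(|delta|/4 - 1)."""
    assert delta < 0 and delta % 4 == 1, "expects delta < 0 with delta = 1 (mod 4)"
    a = (1 - delta) // 4
    bound = abs(delta) // 4 - 1  # equals floor(|delta|/4 - 1)
    return all(is_prime(x * x + x + a) for x in range(bound + 1))


# The odd discriminants listed in Theorem 2.3 (class number one).
class_number_one = [-3, -7, -11, -19, -43, -67, -163]
# A few discriminants congruent to 1 modulo 4 that are not on that list.
others = [-15, -23, -35, -51, -91]

print([d for d in class_number_one if rabinowitsch_holds(d)])
print([d for d in others if rabinowitsch_holds(d)])
```

For $\Delta = -3$ the range of $x$ is empty, so the condition holds vacuously in this sketch.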


Remark 2.3. It is not very well-known that Gauss’s class number one problem was
solved by Landau in 1903. Although Theorem 2.3 is known as the solution of
Gauss’s class number one problem, this is not entirely accurate. The problem is
essentially one of interpretation. Gauss’s discriminant as defined in footnote 1
makes $\Delta = b^2 - ac$ even. Thus, Landau proved that $h_\Delta > 1$ for $\Delta = b^2 - ac < -7$,
a much simpler problem than the one solved in Theorem 2.3.


Theorems 2.2-2.3 show us that $F_A(x) = x^2 + x + A$ cannot be consecutively
prime for $x = 0, 1, 2, \ldots, A - 2$ when $A > 41$. The Euler polynomial tops the list
with a prime-production length of 40. Thus, although Theorem 2.1 tells us that,
given any integer $B > 0$, there exists a polynomial $x^2 + x + A$ that is prime for all
nonnegative integers $x < B$, Theorems 2.2-2.3 establish that $B$ cannot be $A - 2$
unless $A \le 41$. This points to the special nature of Euler’s polynomial as the
optimal such prime-producer. We have therefore achieved Goal #1.


Remark 2.4. The reader should understand that, one day, we may discover a
polynomial $F_A(x)$ with $A > 0$ having prime-production length bigger than 40,
probably much bigger, if we believe Theorem 2.1. However, $A$ will be enormous
relative to the prime-production length. The search for even one such polynomial 
with prime-production length 41 has already shown, via the work in [18], that A 
must be bigger than 10”. 


Now we turn to the non-monic case for prime-producing quadratics. Legendre
knew that $2x^2 + 29$ is prime for $x = 0, 1, \ldots, 28$ [7, p. 420]. Also, Lévy [16]


⁴A two-line proof of this fact for a more general case is in [19, Theorem 4.1.2, p. 108]. An interesting
anecdote about Rabinowitsch comes from Mordell [21]: “In 1923 I attended a meeting of the American 
Mathematical Society held at Vassar College in New York State. Someone called Rainich from the 
University of Michigan at Ann Arbor, gave a talk upon the class number of quadratic fields, a subject in 
which I was very much interested. I noticed that he made no reference to a rather pretty paper written 
by Rabinowitz from Odessa and published in Crelle’s journal. I commented upon this. He blushed and 
stammered and said, “I am Rabinowitz”. He had moved to the U.S.A. and changed his name...”. 
Thanks go to Alf van der Poorten for making me aware of Mordell’s paper. The spelling of 
Rabinowitsch used in this article coincides with that in Crelle [23]. 
discovered in 1914 that $3x^2 + 3x + 23$ is prime for $x = 0, 1, \ldots, 21$. More recently,
Van der Pol and Speziali [26] observed in 1951 that $6x^2 + 6x + 31$ is prime for
$x = 0, 1, \ldots, 28$. Why do these non-monic polynomials generate these initial strings
of primes? Are there more of them with even larger prime-production lengths? To
answer these questions, we first observe that these polynomials share a common
shape and prime-production length: $2x^2 + 29 = qx^2 - \Delta/(4q)$, $\ell = \lfloor |\Delta|/(4q) \rfloor = 29$,
where $q = 2$ and $\Delta = -232$; $3x^2 + 3x + 23 = qx^2 + qx + (q^2 - \Delta)/(4q)$,
$\ell = \lfloor |\Delta|/(4q) \rfloor = 22$, where $q = 3$ and $\Delta = -267$; and $6x^2 + 6x + 31 =
qx^2 + qx + (q^2 - \Delta)/(4q)$, $\ell = \lfloor |\Delta|/(4q) \rfloor = 29$, where $q = 6$ and $\Delta = -708$. This
motivates the following.


Definition 2.2. If $\Delta$ is a fundamental discriminant, and $q \ge 1$ is a square-free divisor
of $\Delta$, then we call

$$F_{\Delta,q}(x) = \begin{cases} qx^2 - \Delta/(4q) & \text{if } 4q \mid \Delta, \\ qx^2 + qx + (q^2 - \Delta)/(4q) & \text{otherwise,} \end{cases}$$

the $q$th Euler-Rabinowitsch polynomial. (Note that, for $q = 1$, and $\Delta$ odd, $F_{\Delta,1}(x)$ is
the Eulerian form that began our discussion.)

We also let $F(\Delta, q)$ denote the maximum number of (not necessarily distinct)
primes dividing $F_{\Delta,q}(x)$ for any rational integer $x$ with $0 \le x \le \lfloor |\Delta|/(4q) - 1 \rfloor$. We call
$\lfloor |\Delta|/(4q) - 1 \rfloor$ the $q$th Rabinowitsch bound. (Note that, for $q = 1$, this is the
Rabinowitsch bound in Theorem 2.2.)
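
To make Definition 2.2 concrete, here is a minimal sketch that builds $F_{\Delta,q}(x)$, evaluates $F(\Delta, q)$ up to the $q$th Rabinowitsch bound, and measures the prime-production length from $x = 0$. The function names (`omega`, `euler_rabinowitsch`, `max_prime_factors`, `production_length`) are assumptions made for illustration and are not taken from the paper.

```python
def omega(n: int) -> int:
    """Number of prime factors of |n|, counted with multiplicity."""
    n = abs(n)
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:
        count += 1
    return count


def euler_rabinowitsch(delta: int, q: int):
    """Return the q-th Euler-Rabinowitsch polynomial F_{delta,q} as a callable."""
    if delta % (4 * q) == 0:
        c = -delta // (4 * q)
        return lambda x: q * x * x + c
    numerator = q * q - delta
    assert numerator % (4 * q) == 0, "q must be a suitable square-free divisor of delta"
    c = numerator // (4 * q)
    return lambda x: q * x * x + q * x + c


def rabinowitsch_bound(delta: int, q: int) -> int:
    """floor(|delta|/(4q) - 1)."""
    return abs(delta) // (4 * q) - 1


def max_prime_factors(delta: int, q: int) -> int:
    """F(delta, q): the largest omega(F_{delta,q}(x)) for 0 <= x <= the q-th bound."""
    f = euler_rabinowitsch(delta, q)
    return max(omega(f(x)) for x in range(rabinowitsch_bound(delta, q) + 1))


def production_length(f) -> int:
    """Number of consecutive x = 0, 1, 2, ... for which f(x) is prime."""
    x = 0
    while omega(f(x)) == 1:
        x += 1
    return x


# The three classical non-monic examples: (delta, q).
for delta, q in [(-232, 2), (-267, 3), (-708, 6)]:
    f = euler_rabinowitsch(delta, q)
    print(delta, q, max_prime_factors(delta, q), production_length(f))
```

In this sketch the three pairs reproduce $2x^2 + 29$, $3x^2 + 3x + 23$, and $6x^2 + 6x + 31$; a value of 1 for `max_prime_factors` means the polynomial is prime at every $x$ up to the bound.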


The three non-monic prime-producers, found by Legendre, Lévy, and Van der
Pol and Speziali have the same shape, and they each have prime-production
lengths $\ell = \lfloor |\Delta|/(4q) \rfloor$. These three quadratics also share another feature: $2x^2 + 29$
has discriminant $-232 = -2^3 \cdot 29$ ($n = 2$ distinct prime divisors) and $h_{-232} =
2 = 2^{n-1}$; $3x^2 + 3x + 23$ has discriminant $-267 = -3 \cdot 89$ ($n = 2$ distinct prime
divisors) and $h_{-267} = 2 = 2^{n-1}$; and $6x^2 + 6x + 31$ has discriminant $-708 =
-2^2 \cdot 3 \cdot 59$ ($n = 3$ distinct prime divisors) and $h_{-708} = 2^2 = 2^{n-1}$. This motivates
the statement of our second goal.


Goal #2: Let $\Delta < 0$ be a fundamental discriminant having $N + 1$ distinct prime
factors, and let $q$ be the product of all of them, excluding the largest one. Find all
polynomials $F_{\Delta,q}(x)$ such that $F(\Delta, q) = 1$ and $h_\Delta = 2^N$. In other words, find all
$F_{\Delta,q}(x)$ that are prime for $x = 0, 1, \ldots, \lfloor |\Delta|/(4q) - 1 \rfloor$ with $h_\Delta = 2^N$.


Goal #2 seeks those $F_{\Delta,q}(x)$ with prime-production length $\ell \ge \lfloor |\Delta|/(4q) \rfloor$ and
$h_\Delta = 2^N$. If $q = 1$, then the solution to Goal #2 for $\Delta \equiv 1 \pmod 4$ is the solution
to Goal #1, since the only values for which $\ell = \lfloor |\Delta|/4 \rfloor = A - 1 \ge 1$, and $h_\Delta = 1$
are those $\Delta \equiv 1 \pmod 4$ ($\Delta < -3$) in Theorem 2.3. If $q = 1$ and $\Delta \equiv 0 \pmod 4$,
then $\Delta = -8$, for which $\ell = 2$, is the only possibility for $\ell \ge \lfloor |\Delta|/4 \rfloor \ge 1$, given
the discussion preceding Proposition 2.1. Therefore, we concentrate upon the
(non-monic) case where $q > 1$. Not surprisingly, the achievement of Goal #2
requires a generalization of Rabinowitsch’s criterion. In order to do this, we must
go back to Gauss, whose classical result we state in modern terminology.


Theorem 2.4. Suppose $\Delta < 0$ is a fundamental discriminant and $\Delta$ has $N + 1$
distinct prime factors. Then $h_\Delta = 2^N$ if and only if every element of $C_\Delta$ has order 1
or 2.


Proof: See [19, Theorem 1.3.3, p. 16]. □

When every element of $C_\Delta$ has order 2, we say that it has exponent 2, denoted
$e_\Delta = 2$. In general, the exponent is the smallest positive integer $e_\Delta$ such that
$\{I\}^{e_\Delta} = 1$ for all $\{I\} \in C_\Delta$, where 1 denotes the trivial ideal class $\{\mathcal{O}_\Delta\}$. If every
element has order 1, then $h_\Delta = e_\Delta = 1$, so $C_\Delta = 1$. Therefore, we may restate
Gauss’s result as follows:


Theorem 2.5 (Theorem 2.4 restated: Gauss’s Exponent 2 Theorem). Suppose $\Delta < 0$
is a fundamental discriminant and $\Delta$ has $N + 1$ distinct prime factors. Then $h_\Delta =
2^N$ if and only if $e_\Delta \le 2$.


Remark 2.5. For those readers who prefer the theory of binary quadratic forms,
Gauss’s result says that, when $\Delta < 0$, there is one class per genus in the form class
group of discriminant $\Delta$ if and only if this form class group has exponent less than
or equal to 2. Euler was interested in such discriminants, which he called
convenient numbers or numeri idonei. Gauss listed sixty-five such numbers in [11,
§303]. Today we know that this list is complete under the assumption of the
generalized Riemann hypothesis; see [19, pp. 172-186]. Finally, for those who
prefer class-field theory, an ideal class group has $e_\Delta \le 2$ if and only if the Hilbert
class field is equal to the genus field [6, p. 127 ff.].


Is the Rabinowitsch criterion merely an isolated curiosity? Is there some deeper 
underlying phenomenon? To answer these questions, we developed the following 
result, which not only generalizes Theorem 2.2, but also incorporates Theorem 2.5. 
The values of $\Delta = -3, -4$ are excluded in the following for trivial technical
reasons. 


Theorem 2.6 (Mollin [20]). Let $\Delta < -4$ be a fundamental discriminant that has
$N + 1$ distinct prime factors, with $p$ being the largest, and suppose $q \ge 1$ is a
square-free divisor of $\Delta$, having $M \ge 0$ distinct prime factors all less than $p$. Then
$e_\Delta \le 2$ if and only if $F(\Delta, q) = N + 1 - M$ and $h_\Delta = 2^{F(\Delta,q)+M-1}$.


Thus, the Rabinowitsch criterion, an application of Theorem 2.6 for M = 0, 
may be restated as follows. 


Corollary 2.1. Let $\Delta < -4$ be a fundamental discriminant. Then $h_\Delta = 1$ if and only
if $F(\Delta, 1) = 1$.


We also get the more recent class number 2 phenomenon by R. Sasaki (see
[19, p. 112]), another application of Theorem 2.6 for $M = 0$.


Corollary 2.2. Let $\Delta < 0$ be a fundamental discriminant. Then $h_\Delta = 2$ if and only if
$F(\Delta, 1) = 2$.


Corollaries 2.1-2.2 are actually very special because, if $e_\Delta \le 2$, then $h_\Delta \le 2$ if
and only if $F(\Delta, 1) = h_\Delta$. Therefore, the fields for which $\Delta < 0$ and $h_\Delta \le 2$ are
uniquely characterized by these corollaries.


Remark 2.6. In the language of binary quadratic forms, the conclusion of Theorem
2.6 says: There is one class per genus in the ideal class group of discriminant $\Delta$ if
and only if $F(\Delta, q) = N + 1 - M$ and $h_\Delta = 2^{F(\Delta,q)+M-1}$.




Theorem 2.5 tells us that $e_\Delta \le 2$ if and only if $h_\Delta = 2^N$. Thus, one direction in
Theorem 2.6 is clear. Let $q$ be the product of all prime factors of $\Delta$ excluding the
largest (namely, when $M = N$). What is new and revealing is that $h_\Delta = 2^N$ implies
$F(\Delta, q) = 1$. In particular, we have proved the striking result that $F(\Delta, q) = 1$
when $e_\Delta \le 2$, that is, $F_{\Delta,q}(x)$ is always prime up to the $q$th Rabinowitsch bound.
This links $e_\Delta \le 2$ to the prime-production length of $F_{\Delta,q}(x)$. Thus, Theorem 2.6
shows that the Rabinowitsch criterion underlies a deeper phenomenon: a criterion
for $e_\Delta \le 2$ in terms of the factorization of $F_{\Delta,q}(x)$. This is enough information to
achieve Goal #2. Theorem 2.3 identifies those $\Delta < 0$ for which $e_\Delta = h_\Delta = 1$. From
the work of Weinberger [27], we know an upper bound on the number of
discriminants $\Delta < 0$ for which $e_\Delta = 2$, under the assumption of the generalized
Riemann hypothesis. Therefore, a computer search can provide us with a complete
list of them.


Tables: The tables are broken down into congruences modulo 4 of the radicand
$D$. In each of them, $q$ is the product of all distinct primes dividing the discriminant
$\Delta$, excluding the largest prime $p$. Moreover, $\ell$ is the prime-production length of
$F_{\Delta,q}(x)$, and $h_\Delta$ is the class number. Finally, concerning Remark 2.5, the reader
will notice that the number of discriminants in Theorem 2.3 and Tables 1-3 is
sixty-five. Thirty-five of these values are on Gauss’s list of convenient numbers.
These are the fundamental discriminants $\Delta$ for which either $\Delta \equiv 0 \pmod 4$ or
$\Delta = -15$. The balance of the values in Gauss’s list are discriminants arising from a
more general situation (see [19, pp. 112-128]). Finally, the reader should note that
$\ell = \lfloor |\Delta|/(4q) \rfloor$ in both Theorem 2.3 and in Tables 1-2, and $\lfloor |\Delta|/(4q) \rfloor + 3 \ge \ell \ge
\lfloor |\Delta|/(4q) \rfloor$ in Table 3.

In Table 1, $6x^2 + 6x + 31$ is the optimal prime-producer. In Table 2, the
optimal prime-producer is $2x^2 + 29$. The optimal prime-producer in Table 3 is
$3x^2 + 3x + 23$. Thus, the optimal prime-producers are the polynomials that motivated
our quest to achieve Goal #2! Therefore, these ubiquitous examples in the
literature, which have appeared as isolated curiosities with no complete and 


TABLE 1. $D \equiv 3 \pmod 4$

F_{Δ,q}(x)               h_Δ
2x^2 + 2x + 3             2
2x^2 + 2x + 7             2
6x^2 + 6x + 5             4
6x^2 + 6x + 7             4
2x^2 + 2x + 19            2
6x^2 + 6x + 11            4
10x^2 + 10x + 11          4
6x^2 + 6x + 17            4
30x^2 + 30x + 11          8
14x^2 + 14x + 13          4
30x^2 + 30x + 13          8
6x^2 + 6x + 31            4
22x^2 + 22x + 17          4
42x^2 + 42x + 17          8
30x^2 + 30x + 19          8
42x^2 + 42x + 19          8
70x^2 + 70x + 23          8
210x^2 + 210x + 59       16

[The $\ell$ column is illegible in this copy.]

$F_{\Delta,q}(x)$ is prime for all nonnegative integers $x < \ell$.
TABLE 2. $D \equiv 2 \pmod 4$

F_{Δ,q}(x)
10x^2 + 19
30x^2 + 7
30x^2 + 11
42x^2 + 11

[The remaining rows and the $\ell$ and $h_\Delta$ columns are illegible in this copy.]

$F_{\Delta,q}(x)$ is prime for all nonnegative integers $x < \ell$.

TABLE 3. $D \equiv 1 \pmod 4$

F_{Δ,q}(x)               h_Δ
3x^2 + 3x + 2             2
5x^2 + 5x + 3             2
3x^2 + 3x + 5             2
7x^2 + 7x + 5             2
5x^2 + 5x + 7             2
3x^2 + 3x + 11            2
11x^2 + 11x + 7           2
15x^2 + 15x + 7           4
5x^2 + 5x + 13            2
3x^2 + 3x + 23            2
13x^2 + 13x + 11          2
7x^2 + 7x + 17            2
15x^2 + 15x + 11          4
21x^2 + 21x + 11          4
15x^2 + 15x + 13          4
35x^2 + 35x + 13          4
33x^2 + 33x + 13          4
55x^2 + 55x + 17          4
15x^2 + 15x + 17          4
105x^2 + 105x + 29        8
35x^2 + 35x + 19          4
105x^2 + 105x + 31        8
231x^2 + 231x + 61        8
195x^2 + 195x + 53        8

[The $\ell$ column is illegible in this copy.]

$F_{\Delta,q}(x)$ is prime for all nonnegative integers $x < \ell$.


detailed explanation, are now fully understood as the optimal prime-producers
with discriminants $\Delta < 0$ having $e_\Delta = 2$. Thus, Tables 1-3, via Theorem 2.6, give
us the complete list and Goal #2 is achieved, assuming the validity of the
generalized Riemann hypothesis, of course.


Remark 2.7. Once we leave the case of exponent 2, it is still possible to have
$F(\Delta, q) = N + 1 - M$. However, when $e_\Delta > 2$, the other condition in Theorem 2.6
fails, namely $h_\Delta \ne 2^{F(\Delta,q)+M-1}$. The following example illustrates these assertions.

Let $\Delta = -9867 = -3 \cdot 11 \cdot 13 \cdot 23$ with $q = 429 = 3 \cdot 11 \cdot 13$, and $M =
N = 3$. Then, $F(\Delta, q) = N + 1 - M = 1$, since $F_{\Delta,q}(x) = 429x^2 + 429x + 113$ is
prime for all nonnegative integers $x \le \lfloor |\Delta|/(4q) - 1 \rfloor = 4$. However, $h_\Delta = 2^4$ and
$4 \ne F(\Delta, q) + M - 1 = 3$. Also, $e_\Delta = 4$, since $C_\Delta$ is the product of two cyclic
groups of order 2 and one of order 4. This example also illustrates that the two
conditions in Theorem 2.6 are sharply necessary and sufficient conditions for
exponent 2 to occur.


Before closing this section, it is worth revealing to the reader a criterion for
$e_\Delta \le 2$ in terms of the form of $\Delta < 0$. To do this, we must understand a little about
the arithmetic of ideals in $K = \mathbb{Q}(\sqrt{D})$.

A non-unit ideal $I$ (namely, $I \ne \mathcal{O}_\Delta$) is a prime ideal if, whenever $I$ divides a
product of ideals $I_1 I_2$ in $\mathcal{O}_\Delta$, then $I \mid I_1$ or $I \mid I_2$; this mimics the property of a prime
$p \in \mathbb{Z}$, of course. Here “divides” means if $I \mid J$, then there exists an $\mathcal{O}_\Delta$-ideal $H$
such that $J = HI$. It follows that $J \subseteq I$. Conversely, if $J \subseteq I$, then there is an ideal
$H$ such that $J = HI$. Thus, ideals that “divide” are those that contain. Therefore, $I$
is a prime ideal if and only if it contains some non-trivial factor of any product it
divides. From now on, when we say that a prime ideal $\mathcal{P}$ is above $p$, we mean that
$\mathcal{P}$ divides $(p)$. Furthermore, we call $p$ non-inert if $(\Delta/p) \ne -1$, where $(*/p)$ is the
Kronecker symbol.


Remark 2.8. Recall that the Kronecker symbol is defined as follows. If $\gcd(p, \Delta)
= 1$ for a discriminant $\Delta$, then the Kronecker symbol for $p > 2$ is just the
Legendre symbol, namely $(\Delta/p) = 1$ if $\Delta$ is a quadratic residue modulo $p$, and
$(\Delta/p) = -1$ otherwise. If $p \mid \Delta$, then $(\Delta/p) = 0$; if $p = 2$, then $(\Delta/2) = 1$ if $\Delta \equiv 1
\pmod 8$, and $(\Delta/2) = -1$ if $\Delta \equiv 5 \pmod 8$. See [19, Exercise 1.1.4(b), p. 8].


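If $p$ is non-inert, then there are two cases. Either $(\Delta/p) = 1$, in which case we
say that $p$ splits in $K$, since $(p) = \mathcal{P}_1 \mathcal{P}_2$ with $\mathcal{P}_1 \ne \mathcal{P}_2$; or $(\Delta/p) = 0$, in which
case we say that $p$ ramifies since $p \mid \Delta$ and $(p) = \mathcal{P}^2$ for a unique prime $\mathcal{O}_\Delta$-ideal $\mathcal{P}$
over $p$.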


Theorem 2.7. Suppose $\Delta < 0$ is a fundamental discriminant. Then $e_\Delta \le 2$ if and only
if, for every split prime $p < \sqrt{-\Delta/3}$, there exists a square-free divisor $q > p$ of $|\Delta|$
such that $\Delta = q^2 - 4pq$.


Proof: [20, Theorem 3.1, pp. 22-23]. □

We close this section with an illustration of Theorem 2.7 from Table 3.


Example 2.1. If $\Delta = -3315$, then the only split primes $p < \sqrt{-\Delta/3}$ are $p = 29, 31$.
Therefore, $\Delta = 65^2 - 4 \cdot 29 \cdot 65 = 85^2 - 4 \cdot 31 \cdot 85$.
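
The criterion of Theorem 2.7 is easy to test by machine for small discriminants. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the Kronecker symbol is implemented for prime arguments only (following Remark 2.8), and divisors are found by brute force.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def is_squarefree(n: int) -> bool:
    """True when no square of a prime divides n."""
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return False
        p += 1
    return True


def kronecker_prime(delta: int, p: int) -> int:
    """Kronecker symbol (delta/p) for a prime p and a discriminant delta."""
    if delta % p == 0:
        return 0
    if p == 2:
        return 1 if delta % 8 == 1 else -1
    # Euler's criterion for odd primes.
    return 1 if pow(delta % p, (p - 1) // 2, p) == 1 else -1


def split_primes(delta: int) -> list:
    """Primes p with (delta/p) = 1 and p < sqrt(-delta/3), i.e. 3*p*p < -delta."""
    result, p = [], 2
    while 3 * p * p < -delta:
        if is_prime(p) and kronecker_prime(delta, p) == 1:
            result.append(p)
        p += 1
    return result


def witnesses(delta: int) -> dict:
    """For each split prime p, the square-free divisors q > p of |delta|
    satisfying delta = q^2 - 4pq."""
    n = abs(delta)
    return {
        p: [q for q in range(p + 1, n + 1)
            if n % q == 0 and is_squarefree(q) and q * q - 4 * p * q == delta]
        for p in split_primes(delta)
    }


def exponent_at_most_two(delta: int) -> bool:
    """The condition of Theorem 2.7: every split prime has at least one witness q."""
    return all(witnesses(delta).values())


print(split_primes(-3315))
print(witnesses(-3315))
print(exponent_at_most_two(-3315), exponent_at_most_two(-23))
```

For $\Delta = -3315$ the witnesses found include the values $q = 65$ and $q = 85$ of Example 2.1. For $\Delta = -23$ the prime 2 splits and has no witness, so the test fails there.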


3. DENSITY OF PRIMES. In Section 1, we were concerned with quadratic 
polynomials producing consecutive primes for an initial string of input values. 
Therein, we solved the problem for all gth Euler-Rabinowitsch polynomials, 
assuming the generalized Riemann hypothesis. It is natural to ask how many 
primes are produced by a given quadratic polynomial up to a given bound. For 
instance, motivated by Euler’s polynomial and its generalizations, we would like to 
know the number $P_A(n)$ of primes assumed by $F_A(x) = x^2 + x + A$ for $x =
0, 1, 2, \ldots, n$, with $\Delta = 1 - 4A < 0$. We showed in Section 1 that $P_A(A - 2) =
A - 1$ if and only if $A \in \{1, 2, 3, 5, 11, 17, 41\}$, in other words for
$\Delta \in \{-3, -7, -11, -19, -43, -67, -163\}$. Before it was known that $A = 41$ is
the largest value for which this phenomenon occurs, Dick Lehmer, in 1936,
attempted to find a larger such value of $A$. First he observed that $A$ must be odd
in order for $F_A(x)$ to be prime, so $-\Delta \equiv 3 \pmod 8$. From the discussion leading up
to Proposition 2.1, we see that $F_A(x)$ can be divisible only by non-inert primes (also
see [19, Lemmas 4.1.2-4.1.3, p. 118]). Lehmer exploited this fact by observing that,
since a prime $q$ with $(\Delta/q) = -1$ cannot divide $F_A(x)$, then $F_A(x)$ would be a
prime a larger percentage of the time for a $\Delta < 0$ when $(\Delta/q) = -1$ for “enough”
small primes g. Lehmer showed that the least positive integer Nig = 3 (mod 4) 
such that (—Nigo/gq) = —1 for all primes g < 109 must satisfy Nig, > 5-107. 
For P,(A — 2)=A-—1, Lehmer proved that -A =44 —1>Ni, so A= 
(1 — A)/4 > 1.25-10°. This told Lehmer that the existence of A> 41 with 
P,(A — 2) =A — 1 was highly unlikely. Thus, he thought that it might be more 
fruitful to search for an A with P,(40) = 41. If such an A exists, we have noted in 
Section 1 that A > 10”. 

In 1939, Beeger published his discovery of $F_A(x) = x^2 + x + A$, for $A = 19421$,
$27941$, $72491$, and $\Delta = -77683, -111763, -289963$ [2]. He found these values by
computing all positive integers N with N = 3 (mod 8) and N < 10° and such that 
(—N/q) = —1 for all odd primes q < 43. The only such numbers are N = 77683, 
111763, 289963. In 1939, Poletti [22] computed tables of values of P,(n) for these 
A and found that P,5,,(11000) = 4819, P,,,9,(11000) = 4923 and P,,(11000) = 
4605, so Beeger’s polynomial is better at prime-production than Euler’s polyno- 
mial. What is behind all of this is Hardy and Littlewood’s “Conjecture F”’ from 
1923 [121]: 


Conjecture 3.1. If $\Delta = 1 - 4A$ is not a perfect square, then

$$P_A(n) \sim C(\Delta) \cdot \frac{n}{\log(n)},$$

where

$$C(\Delta) = \prod_{\text{primes } p \ge 3} \left(1 - \frac{(\Delta/p)}{p - 1}\right).$$


Remark 3.1. Conjecture F is more general than this, but we have boiled it down to
suit our purposes. In order to accommodate an error term, with which we are not
concerned here, $n/\log(n)$ is replaced by $2\int_0^n dx/\log F_A(x)$ (see [19, pp. 145-147]
for more details). Also, recall that $f(n) \sim g(n)$ means $\lim_{n \to \infty} f(n)/g(n) = 1$, and
$\log(n)$ means the natural logarithm to the base $e$.


For instance, we know from the work of Shanks⁵ [25] that $C(-163) =
3.319773177471$ for $A = 41$, and $C(-289963) = 3.694708051836$ for $A = 72491$,
confirming Beeger’s observation. Thus, Conjecture 3.1 allows us to find examples
of $F_A(x)$ with a high asymptotic density of prime values. How $C(\Delta)$ is computed and
extensions of Beeger’s ideas to find new $F_A(x)$ with high asymptotic density of

⁵The death of Daniel Shanks, at the age of seventy-nine on September 9, 1996, is a great loss to the
world of mathematics. This author was fortunate enough to have known him and enjoyed his rapier wit, 
as well as the depth of his intellect. A pioneer and a giant in the world of computational number theory 
has left us. 
primes, can be found in [18]. Therein, the authors report results obtained with the
new number sieve MSSU, developed at the University of Manitoba. This acronym
means The Manitoba Scalable Sieve Unit, which employs Very Large Scale Integration
(VLSI) circuits designed by C. Patterson at the University of Calgary (on whose
Ph.D. committee this author sat). The raw aggregate sieving rate of the MSSU is
more than 6.4 billion integers per second.

In 1990, Fung and Williams [10] found $C(\Delta) = 5.0870883$ for $\Delta =
-531497118115723$ with $A = 132874279528931$, and $P_A(10^6) = 312975$. Compare
this with $P_{41}(10^6) = 261080$. In this example, they found that $(\Delta/q) = -1$ for all
odd primes g < 179. In [18], the authors used the MSSU to search for all 
|A| < 10” with A = 5 (mod 8) and (A/q) = —1 for all odd primes g < 200. The 
largest value of C(A) that they found is 


$$C(\Delta) = 5.3384021$$
for
$$\Delta = -6849319464662435083.$$
Therefore, assuming Conjecture 3.1, $F_A(x) = x^2 + x + A$ with
$$A = 1712329866165608771$$


has the largest asymptotic density of primes for any polynomial of this type to date. 
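
To get a feel for the constant $C(\Delta)$ of Conjecture 3.1, one can truncate the product at a finite prime bound. The following is a rough sketch only: the cutoff `limit` and the function names are assumptions, the infinite product converges slowly, and the high-precision values quoted above were obtained in [25] and [18] by more refined methods, so the output here should be read as an approximation.

```python
def primes_up_to(limit: int) -> list:
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def legendre(delta: int, p: int) -> int:
    """Legendre symbol (delta/p) for an odd prime p, via Euler's criterion."""
    a = delta % p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def hardy_littlewood_constant(delta: int, limit: int = 200_000) -> float:
    """Truncated product over odd primes p <= limit of 1 - (delta/p)/(p - 1)."""
    c = 1.0
    for p in primes_up_to(limit):
        if p == 2:
            continue
        c *= 1.0 - legendre(delta, p) / (p - 1)
    return c


c_euler = hardy_littlewood_constant(-163)      # A = 41
c_beeger = hardy_littlewood_constant(-289963)  # A = 72491
print(c_euler, c_beeger)
```

Primes $p$ with $(\Delta/p) = -1$ contribute factors $p/(p-1) > 1$, which is why discriminants that are non-residues modulo many small primes yield large values of $C(\Delta)$.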

Other authors have looked at less demanding problems involving prime-produc- 
ing quadratics. For example, Boston and Greenwood [3] wanted to find a quadratic 
polynomial $f(x)$ that represents the most distinct primes for $0 \le x \le 99$. Of
course, we have observed earlier that we can get longer strings if we allow
repetitions than if we impose the restriction of distinctness. In [3], it is observed
that $x^2 - 69x + 1231$ represents ninety-five primes for $x = 0, \ldots, 99$, although not
distinct. However, this is just the special case $k_{-5}(x)$ of our family $k_n(x) =
x^2 - (2n + 79)x + n^2 + 79n + 1601$ introduced in Section 1. Moreover, we can
improve upon their result slightly via $k_n(x)$. One can verify that $k_n(x)$ produces
one hundred primes for $n \le x \le n + 104$, with the first eighty being consecutive,
and twenty of the last twenty-five being primes. This provides an infinite family of
polynomials that generate one hundred primes (eighty of which are consecutive) in
intervals of one hundred and five.
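
The counts quoted for the family $k_n(x)$ can be checked directly. The short script below is a sketch for that purpose; the helper names `k` and `prime_flags` are illustrative only.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def k(n: int, x: int) -> int:
    """k_n(x) = x^2 - (2n + 79)x + n^2 + 79n + 1601."""
    return x * x - (2 * n + 79) * x + n * n + 79 * n + 1601


def prime_flags(n: int) -> list:
    """Primality of k_n(x) for the 105 inputs n <= x <= n + 104."""
    return [is_prime(k(n, x)) for x in range(n, n + 105)]


flags = prime_flags(0)
print(sum(flags), all(flags[:80]), sum(flags[80:]))

# The Boston-Greenwood example x^2 - 69x + 1231 is k_{-5}(x).
print(sum(is_prime(x * x - 69 * x + 1231) for x in range(100)))
```

Since $k_n(x) = k_0(x - n)$, the counts do not depend on $n$; the values repeat because $k_0$ is symmetric about $x = 39.5$, which is why the primes are not distinct.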

The polynomial in [3] that generates the most distinct values is $41x^2 +
33x - 43321$, with $\Delta = 7105733$. This has prime values for ninety of the input
values 0 <x < 99. However, the most consecutive primes in this sequence is 
twenty-six, so the problem described in Section 1, which is the real test, is not 
addressed. The challenge left at the end of [3] (where the authors cite C(A) for 
earlier efforts of Fung and Williams but not the latest in [18]) is to find an example 
that gives forty-nine or even fifty consecutive, distinct primes. Theorem 2.1 says 
one should be able to do so. 

In looking at class number one problems for real quadratic fields, the Fung 
polynomials 


$$g_1(x) = 47x^2 - 2247x + 21647$$
and
$$g_2(x) = 103x^2 - 3945x + 34381,$$
and the Ruby polynomial
$$g_3(x) = 36x^2 - 810x + 2753$$
were found. It turns out that $g_1(x)$ and $g_2(x)$ have prime-production lengths
$\ell = 43$, and $g_3(x)$ has prime-production length $\ell = 45$. (For $g_1(x)$, $C(\Delta) =
3.7006266$, where $\Delta = 979373$; for $g_2(x)$, $C(\Delta) = 3.9727065$, where $\Delta = 1398053$; and
for $g_3(x)$, $C(\Delta) = 3.9099833$, where $\Delta = 259668$; see [18, p. 118].) To date, these
are the largest known consecutive, distinct, quadratic prime-producers, namely
those with the largest prime-production lengths. The polynomials $g_i(x)$ were
spinoffs from the investigations involving a search for $h_\Delta = 1$ when $\Delta > 0$, by this
author and Hugh Williams (see [19, Chapter 4, pp. 129-145]). The discovery of $g_1$
and $g_2$ by Gilbert Fung, then a graduate student of Hugh Williams, was announced
by this author in a lecture given during the Western Number Theory
Conference at Las Vegas in 1988. Russell Ruby was in the audience, and he went
home to check these polynomials. Later he found $g_3$.
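
The discriminants and prime-production lengths quoted for $g_1$, $g_2$, and $g_3$ can be reproduced with a few lines of code. This is a sketch under one explicit assumption: since these polynomials take negative values for some inputs, "prime" is interpreted here as "prime in absolute value". The names `polys`, `discriminant`, and `production_length` are illustrative.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


# Coefficients (a, b, c) of a*x^2 + b*x + c.
polys = {
    "g1": (47, -2247, 21647),
    "g2": (103, -3945, 34381),
    "g3": (36, -810, 2753),
}


def discriminant(a: int, b: int, c: int) -> int:
    return b * b - 4 * a * c


def production_length(a: int, b: int, c: int) -> int:
    """Number of consecutive x = 0, 1, 2, ... with |a*x^2 + b*x + c| prime
    (absolute value is an assumption of this sketch)."""
    x = 0
    while is_prime(abs(a * x * x + b * x + c)):
        x += 1
    return x


for name, coeffs in polys.items():
    print(name, discriminant(*coeffs), production_length(*coeffs))
```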
The Poincaré-Miranda Theorem


Wladyslaw Kulpa 


Bernard Bolzano (1781-1848), the outstanding Czech thinker, philosopher and 
mathematician, proved that if a function f, continuous in a closed interval [a, b] 
changes signs at the endpoints; f(a)- f(b) < 0, then this function equals zero at 
some point of the interval. In 1883-1884, Henri Poincaré announced the following 
result without proof [9], [10] (in Browder’s translation [3]):


“Let $f_1, \ldots, f_n$ be $n$ continuous functions of $n$ variables $x_1, \ldots, x_n$: the
variable $x_i$ is subjected to vary between the limits $+a_i$ and $-a_i$. Let us
suppose that for $x_i = a_i$, $f_i$ is constantly positive, and that for $x_i = -a_i$, $f_i$ is
constantly negative; I say there will exist a system of values of $x$ where all
the $f$’s vanish.”


In 1886 Poincaré [11] published his argument on the homotopy invariance of the 
index, which is a basis for the proof. The result obtained by Poincaré has come to 
be known as the theorem of Miranda [8], who in 1940 showed that it was 
equivalent to the Brouwer fixed point theorem. The Poincaré theorem was 
implicitly rediscovered in 1911 by Brouwer [2] who proved that 


“Under a continuous map of the unit cube into itself which displaces every
point less than half a unit, the image has an interior point.”


The Brouwer fixed point theorem for n = 3 was proved by him in 1909; an 
equivalent result was established earlier by Bohl [2] in 1904. It was Hadamard [4] 
who in 1910 gave (using the Kronecker index) the first proof for arbitrary n. In 
1912 Brouwer gave another proof using the simplicial approximation technique, 
and notions of degree. A short and simple proof of the Bohl-Brouwer theorem was 
given in 1929 by Knaster-Kuratowski-Mazurkiewicz [7]; the proof was based on the 
lemma of Sperner [12]. In this paper, we also apply combinatorial methods of 
proof taken from the Sperner lemma, but our proof seems to be simpler because it 
does not require the notion of barycentric coordinates and barycentric subdivision. 

Let us begin by discussing the Poincaré-Miranda Theorem. Let $k > 1$ be a given
natural number and let $Z_k := \{i/k : i \in Z\}$, where $Z$ denotes the set of integers.
Let $Z_k^n$ denote the Cartesian product of $n$ copies of the set $Z_k$:

$$Z_k^n := \{z : \{1, \ldots, n\} \to Z_k \mid z \text{ is a map}\}.$$

Using the Cartesian notation let $0 := (0, \ldots, 0)$ be the neutral element and let
$e_i := (0, \ldots, 0, 1/k, 0, \ldots, 0)$, $e_i(i) = 1/k$, be the $i$-th basic vector. Denote by $P(n)$
the set of permutations of the set $\{1, \ldots, n\}$.


Definition 1. An ordered set $S = [z_0, \ldots, z_n] \subset Z_k^n$ is said to be an $n$-simplex if
there exists a permutation $\alpha \in P(n)$ such that

$$z_1 = z_0 + e_{\alpha(1)}, \quad z_2 = z_1 + e_{\alpha(2)}, \quad \ldots, \quad z_n = z_{n-1} + e_{\alpha(n)}.$$

Any subset $[z_0, \ldots, z_{i-1}, z_{i+1}, \ldots, z_n] \subset S$, $i = 0, \ldots, n$, is said to be the
$(n - 1)$-face of the $n$-simplex $S$. A subset $C \subset Z_k^n$ of the form

$$C = C(k) = \left\{0, \tfrac{1}{k}, \ldots, \tfrac{k-1}{k}, 1\right\}^n$$

is said to be a combinatorial $n$-cube. Define the $i$-th combinatorial back and front
faces of $C$ as

$$C_i^- = C_i^-(k) := \{z \in C : z(i) = 0\}, \qquad C_i^+ = C_i^+(k) := \{z \in C : z(i) = 1\},$$

and the boundary as

$$\partial C := \bigcup \{C_i^- \cup C_i^+ : i = 1, \ldots, n\}.$$

Definition 2. Let $S = [z_0, \ldots, z_n] \subset Z_k^n$ be an $n$-simplex. Then for each point
$z_i \in S$ there exists exactly one $n$-simplex $T = S[i]$ such that

$$S \cap T = \{z_0, \ldots, z_{i-1}, z_{i+1}, \ldots, z_n\}.$$

We shall define the $i$-neighbour $S[i]$ of the simplex $S$ (see Figure 1) as
(a) If $0 < i < n$, then $S[i] := [z_0, \ldots, z_{i-1}, x_i, z_{i+1}, \ldots, z_n]$, where $x_i = z_{i-1} +
(z_{i+1} - z_i) = z_{i-1} + e_{\alpha(i+1)}$.
(b) If $i = 0$, then $S[0] := [z_1, \ldots, z_n, x_0]$, where $x_0 = z_n + (z_1 - z_0)$.
(c) If $i = n$, then $S[n] := [x_n, z_0, \ldots, z_{n-1}]$, where $x_n = z_0 + (z_{n-1} - z_n)$.


Figure 1 


We leave it to the reader to prove that the $n$-simplexes $S[i]$ are well-defined and
that they are the only possible $i$-neighbours of the $n$-simplex $S$.
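
Definitions 1 and 2 translate directly into code, which makes the exercise left to the reader easy to experiment with. In the sketch below, points of $Z_k^n$ are stored as integer tuples in units of $1/k$, coordinates are indexed from 0 rather than 1, and the function names are illustrative choices, not notation from the paper.

```python
from itertools import permutations


def simplex(z0, alpha):
    """Vertices [z_0, ..., z_n] with z_j = z_{j-1} + e_{alpha(j)} (units of 1/k)."""
    verts = [tuple(z0)]
    for a in alpha:
        v = list(verts[-1])
        v[a] += 1
        verts.append(tuple(v))
    return verts


def is_simplex(verts) -> bool:
    """Check Definition 1: consecutive differences are distinct basic vectors."""
    n = len(verts) - 1
    used = set()
    for u, v in zip(verts, verts[1:]):
        diff = [b - a for a, b in zip(u, v)]
        if sorted(diff) != [0] * (n - 1) + [1]:
            return False
        used.add(diff.index(1))
    return len(used) == n


def neighbour(verts, i):
    """The i-neighbour S[i] from Definition 2, cases (a), (b), (c)."""
    n = len(verts) - 1
    add = lambda u, v: tuple(a + b for a, b in zip(u, v))
    sub = lambda u, v: tuple(a - b for a, b in zip(u, v))
    if i == 0:
        return verts[1:] + [add(verts[n], sub(verts[1], verts[0]))]
    if i == n:
        return [add(verts[0], sub(verts[n - 1], verts[n]))] + verts[:n]
    x = add(verts[i - 1], sub(verts[i + 1], verts[i]))
    return verts[:i] + [x] + verts[i + 1:]


S = simplex((2, 2, 2), (0, 1, 2))
for i in range(4):
    T = neighbour(S, i)
    print(i, T, is_simplex(T))
```

Each neighbour shares exactly the face of $S$ opposite $z_i$, and taking the neighbour of $S[i]$ at its new vertex returns $S$.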
From Definition 2 the following observation is immediate: 


Observation. Any $(n - 1)$-face of an $n$-simplex contained in the combinatorial
n-cube C is an (n — 1)-face of exactly one or two n-simplexes from C, depending 
on whether or not it lies on the boundary 0C. 
Let $I^n := [0,1]^n$ be the $n$-dimensional cube of the Euclidean space $R^n$ and let
$\partial I^n$ be its boundary. For each $i \le n$ let us denote

$$I_i^- := \{x \in I^n : x(i) = 0\}, \qquad I_i^+ := \{x \in I^n : x(i) = 1\},$$

the $i$-th opposite faces.


Poincaré-Miranda Theorem. Let $f: I^n \to R^n$, $f = (f_1, \ldots, f_n)$, be a continuous
map such that for each $i \le n$, $f_i(I_i^-) \subset (-\infty, 0]$ and $f_i(I_i^+) \subset [0, +\infty)$. Then there
exists a point $c \in I^n$ such that $f(c) = 0$.


Proof: For each $i = 1, \ldots, n$ define $H_i^- := f_i^{-1}(-\infty, 0]$ and $H_i^+ := f_i^{-1}[0, \infty)$. Since
for each sequence of $n$-simplexes $S_k \subset C(k)$, diameter $S_k \to 0$ as $k \to \infty$, in
order to prove the theorem it suffices to show that for each $k$ there exists an
$n$-simplex $S_k \subset C(k)$ such that

$$H_i^- \cap S_k \ne \emptyset \ne H_i^+ \cap S_k \quad \text{for each } i = 1, \ldots, n. \tag{1}$$

Indeed, using a compactness argument we infer that the intersection

$$H := \bigcap \{H_i^- \cap H_i^+ : i = 1, \ldots, n\}$$

is not empty. It is clear that $f(c) = 0$ for each $c \in H$.
Define a map $\varphi: I^n \to \{0, \ldots, n\}$ by

$$\varphi(x) := \max\Big\{ j : x \in \bigcap_{i=0}^{j} F_i^+ \Big\}, \tag{2}$$

where $F_0^+ := I^n$ and $F_i^+ := H_i^+ \setminus I_i^-$ for each $i = 1, \ldots, n$. Since $I_i^\varepsilon \subset H_i^\varepsilon$, where
$\varepsilon = +$ or $-$, the map $\varphi$ has the following properties:

$$\text{if } x \in I_i^-, \text{ then } \varphi(x) < i, \text{ and if } x \in I_i^+, \text{ then } \varphi(x) \ne i - 1. \tag{3}$$

From (3) it follows that for each subset $S \subset I^n$

$$\varphi(S \cap I_i^\varepsilon) = \{0, \ldots, n - 1\} \text{ implies that } i = n \text{ and } \varepsilon = -. \tag{4}$$

Observe that (2) and the fact that $I^n = H_i^- \cup H_i^+$ imply that

$$\text{if } \varphi(x) = i - 1 \text{ and } \varphi(y) = i, \text{ then } x \in H_i^- \text{ and } y \in H_i^+. \tag{5}$$

Let us call a finite subset $S$ of $j + 1$ points in the combinatorial cube $C = C(k)$
admissible if $\varphi(S) = \{0, \ldots, j\}$. From (1) and (5) it follows that the theorem will be
proved if we show that for each $k$ there exists an admissible $n$-simplex $S \subset C(k)$.
We shall actually prove that for each $k$ the number $\alpha(C(k))$ of admissible
$n$-simplexes is odd.

Our proof is by induction on the dimension $n$ of $C$. The assertion is obvious for
$n = 0$, because $C = \{0\}$, $\varphi(0) = 0$ and $\alpha = 1$.

According to (4) any admissible $(n - 1)$-face $s \subset \partial C$ lies in $C_n^-(k)$ and by our
induction hypothesis the number $\alpha(C_n^-(k))$ of such faces is odd. Let $\alpha(S)$ denote
the number of admissible $(n - 1)$-faces of an $n$-simplex $S \subset C$.

If $S$ is an admissible $n$-simplex, clearly $\alpha(S) = 1$; while if $S$ is not an admissible
$n$-simplex, we have $\alpha(S) = 2$ or $\alpha(S) = 0$ according as $\varphi(S) = \{0, \ldots, n - 1\}$ or
$\{0, \ldots, n - 1\} \setminus \varphi(S) \ne \emptyset$.
Hence

$$\alpha(C(k)) \equiv \sum_S \alpha(S) \pmod 2. \tag{6}$$

On the other hand, an admissible $(n - 1)$-face is counted exactly once or twice in
$\sum_S \alpha(S)$ according as it is in the boundary of $C$ or not. Accordingly

$$\sum_S \alpha(S) \equiv \alpha(C_n^-(k)) \pmod 2, \tag{7}$$

hence

$$\alpha(C_n^-(k)) \equiv \alpha(C(k)) \pmod 2. \tag{8}$$

But $\alpha(C_n^-(k))$ is odd. Thus $\alpha(C(k))$ is odd, too. □
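
The counting argument in the proof can be watched in action on a concrete map. The sketch below labels the grid $C(k)$ by the map of (2), enumerates all $n$-simplexes of Definition 1 inside the cube, and collects the admissible ones. Everything specific here is an assumption made for illustration: the test map `f` on the unit square, the function names, and the use of integer grid coordinates in units of $1/k$ with 0-based indices.

```python
import math
from itertools import permutations, product


def phi(point, k, f):
    """Label from (2): the largest j with point in F_1^+, ..., F_j^+, where
    F_i^+ = {x : f_i(x) >= 0 and x(i) != 0}. `point` is in units of 1/k."""
    values = f(tuple(c / k for c in point))
    j = 0
    for coord, value in zip(point, values):
        if coord == 0 or value < 0:
            break
        j += 1
    return j


def admissible_simplexes(n, k, f):
    """All n-simplexes in C(k) whose vertex labels are exactly {0, ..., n}."""
    labels = {p: phi(p, k, f) for p in product(range(k + 1), repeat=n)}
    found = []
    for z0 in product(range(k), repeat=n):
        for alpha in permutations(range(n)):
            verts = [z0]
            for a in alpha:
                v = list(verts[-1])
                v[a] += 1
                verts.append(tuple(v))
            if {labels[v] for v in verts} == set(range(n + 1)):
                found.append(verts)
    return found


def f(x):
    """Hypothetical test map: f_i <= 0 on the face x(i) = 0 and f_i >= 0 on x(i) = 1."""
    return (x[0] - 0.5 + 0.2 * math.sin(3 * x[1]),
            x[1] - 0.5 + 0.2 * math.cos(3 * x[0]))


for k in (1, 2, 5, 10, 37):
    print(k, len(admissible_simplexes(2, k, f)))

# Any vertex of an admissible simplex on a fine grid approximates a zero of f.
k = 100
vertex = admissible_simplexes(2, k, f)[0][0]
approx_zero = tuple(c / k for c in vertex)
print(approx_zero, f(approx_zero))
```

By (5), each $f_i$ changes sign across the vertices of an admissible simplex, so refining the grid squeezes those vertices toward a zero of $f$.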


Remark. The assumption that for each $i \le n$, $f_i(I_i^-) \subset (-\infty, 0]$ and $f_i(I_i^+) \subset
[0, +\infty)$, can be replaced by the Bolzano condition

$$f_i(a) \cdot f_i(b) \le 0, \quad \text{for each } i \le n \text{ and } a \in I_i^-, \ b \in I_i^+.$$

To verify this, let us assume that $f_i(I_i^\varepsilon) \ne \{0\}$ for each facet $I_i^\varepsilon$. Bolzano’s
condition implies that $\{f_i(I_i^-) \subset (-\infty, 0]$ and $f_i(I_i^+) \subset [0, +\infty)\}$, or $\{f_i(I_i^-) \subset
[0, +\infty)$ and $f_i(I_i^+) \subset (-\infty, 0]\}$. Let us put $F_i := -f_i$ if $\{f_i(I_i^-) \subset [0, +\infty)$ and
$f_i(I_i^+) \subset (-\infty, 0]\}$ and $F_i := f_i$ if not. Applying the Poincaré-Miranda Theorem to
the map $F := (F_1, \ldots, F_n)$, there is a point $c \in I^n$ such that $F(c) = 0$. It is clear
that $f(c) = 0$, too. If $f_i(I_i^\varepsilon) = \{0\}$, then repeating this reasoning for the $(n - 1)$-dimensional
cube $I_i^\varepsilon$ we get a point $c \in I_i^\varepsilon$ for which $f(c) = 0$.


Coincidence Theorem. [f maps g,h: I” —> I” are continuous and if hU;) c I> 
and hU;") cI; foreachi = 1,...,n, then there exists a point c such that g(c) = h(c). 


Proof: Let us put $f(x) := h(x) - g(x)$. The map $f$ satisfies the assumptions of the
Poincaré-Miranda Theorem and therefore there is a point $c \in I^n$ such that
$f(c) = 0$. But this means that $g(c) = h(c)$. ∎
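For $n = 1$ the argument can be run numerically. The sketch below is only an illustration under stated assumptions: it takes $I = [0, 1]$, so the two faces are the endpoints and the hypothesis on $h$ reads $h(0) = 0$, $h(1) = 1$. The name `coincidence_point` and the sample maps are hypothetical choices, not taken from the text. It finds a zero of $f = h - g$ by bisection, which is the one-dimensional Bolzano argument.

```python
import math

def coincidence_point(g, h, tol=1e-12):
    """Bisection on f = h - g over [0, 1].

    Assumes g maps [0, 1] into [0, 1], h(0) = 0 and h(1) = 1,
    so that f(0) <= 0 <= f(1).
    """
    f = lambda x: h(x) - g(x)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) <= 0:
            lo = mid      # keep f(lo) <= 0
        else:
            hi = mid      # keep f(hi) >= 0
    return (lo + hi) / 2

g = math.cos              # continuous, maps [0, 1] into [cos 1, 1]
h = lambda x: x * x       # fixes both endpoints of [0, 1]
c = coincidence_point(g, h)
```

Here `c` approximates a point with $g(c) = h(c)$. In higher dimensions no such simple bisection is available, which is why the combinatorial parity argument is needed.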


Maps g and h that satisfy the conclusion of the Coincidence Theorem are said 
to have the coincidence property. If h is the identity map, we get 


Bohl-Brouwer Fixed Point Theorem. Any continuous map $g: I^n \to I^n$ has a fixed
point.


Applying the Coincidence Theorem to constant maps $g(x) = a$, $a \in I^n$, gives
the


Corollary. A continuous map $h: I^n \to I^n$ is onto if $h(I_i^-) \subset I_i^-$ and $h(I_i^+) \subset I_i^+$ for
each $i = 1, \ldots, n$.


Borsuk Non-Retraction Theorem. Let $f: X \to R^n$ be a continuous map from a
compact set $X \subset R^n$. If $f(x) = x$ for each $x \in \partial X$, then $X \subset f(X)$.

Proof: Let $J^n$ be an $n$-dimensional cube such that $X \cup f(X) \subset J^n$. Extend the
map $f$ to a continuous map $h: J^n \to J^n$ such that $h(x) = x$ for each $x \in J^n \setminus X$.
It is clear that $h(J_i^-) \subset J_i^-$ and $h(J_i^+) \subset J_i^+$ for each $i$. The preceding corollary
ensures that $J^n \subset h(J^n)$, and hence $X \subset f(X)$.


Cube-Squeezing Theorem. Let $h: I^n \to X$ be a continuous map onto a metric
space. If $h(\partial I^n) = X$, then for some $i = 1, \ldots, n$ the images of the $i$-th opposite faces
have non-empty intersection, that is, $h(I_i^-) \cap h(I_i^+) \neq \emptyset$.


Proof: Set $A_i := h(I_i^-)$ and $B_i := h(I_i^+)$ for each $i = 1, \ldots, n$. Suppose that $A_i \cap
B_i = \emptyset$ for each $i$. Since $X$ is a normal space, there exists a continuous map $g: X
\to R^n$, $g = (g_1, \ldots, g_n)$ such that $g_i(A_i) = \{0\}$ and $g_i(B_i) = \{1\}$ for each $i$. It is
clear that $g(A_i) \subset I_i^-$ and $g(B_i) \subset I_i^+$, which implies that $g(X) \subset \partial I^n$. Let us put
$f := g \circ h$. The map $f: I^n \to \partial I^n$ satisfies the assumptions of the preceding
corollary and therefore $f(I^n) = I^n$, which is a contradiction. ∎


When $n = 3$, the theorem says that it is not possible to make a drawing of the
cube $I^3$ in the plane so that disjoint faces of $I^3$ are disjoint in the drawing; see
Figure 2. This remark leads to the conclusion that the Cube-Squeezing Theorem 
has something in common with dimension theory. 


Figure 2 


We shall show that the Cube-Squeezing Theorem holds whenever the set
$h(I^n) \setminus h(\partial I^n)$ is “small”.


Non-Squeezing Theorem. Let $h: I^n \to Y$ be a continuous map onto a metric space
$Y$ such that $h(I_i^-) \cap h(I_i^+) = \emptyset$ for each $i = 1, \ldots, n$. Then there exists a closed subset
$A \subset Y \setminus h(\partial I^n)$ such that $\dim A \ge n$.


Proof: Let $g: h(\partial I^n) \to \partial I^n$ be a continuous map of the kind described in the
proof of the Cube-Squeezing Theorem and suppose that $\dim A < n$ for each
closed subset $A \subset Y \setminus h(\partial I^n)$. Then the map $g$ has a continuous extension $g_1$:
$Y \to \partial I^n$ (see [5, Chapter VI]). Considering the composition of the map $h$ with
the map $g_1$ gives a contradiction with the Corollary. ∎


The Non-Squeezing Theorem may be thought of as a kind of Brouwer domain 
invariance theorem, because in the case when $Y = R^n$, $\dim A = n$ if and only if
the set $A$ has non-empty interior in the space $R^n$.


Lecturing at the “Bored”


Melanie Wahlberg 


Even a superficial review of the literature on collegiate mathematics education 
reform reveals a common thread: the imperative nature of the college mathematics 
student’s active participation in the learning of mathematics. Active participation 
in the classroom may take the form of working cooperatively with classmates in 
small groups, spending time in the computer lab using class-related software, 
presenting problems or concepts on the board to classmates, or other activities 
germane to the subject besides passively listening to a lecture presented by the 
instructor. There is a wide range of lecture styles, but the style to which I refer is
one characterized by little expectation of student engagement besides perhaps 
following the logic of the lesson. Because active learning in the mathematics 
classroom is advocated by several professional groups ([1], [4], [5], [6], [8]), it is
certainly reasonable for mathematics instructors to reflect on their rationale for 
maintaining a lecture style of teaching, even part of the time. During the 1995-1996 
academic year, that is exactly what I did. 

I was teaching a section of reformed calculus, with a class size of forty. I had 
planned on extensive use of cooperative groupwork, and had accordingly divided 
the class into ten groups of four people each. I attempted to make the groups 
heterogeneous (in gender and ability) by collecting quantitative measures of the 
students’ previous mathematical performance. Several times during the first two 
weeks of class, I set aside time for group work. However, due to the physical layout 
of the classroom (five rows of auditorium seating with twelve seats per row and an 
aisle up the middle) my students were having a difficult time actually communicat- 
ing in their groups. Classtime initially devoted to guided discovery in the groups 
degenerated into work time for forty individuals, certainly not the vision I had 
foreseen when planning earlier that summer. To restructure this time, I tried to 
view my entire class as a large cooperative group, with me as the group leader. 
Instead of breaking the class into four-person groups as before, I would present 
examples of concepts we were currently studying, and pose open-ended questions 
to the class. For example, while developing a geometrical approach to Newton’s 
method for finding the roots of an equation, I asked my students which solution 
they thought the graphical process would yield when the initial guess for the 
x-value was directly between two of the roots. Was there a pattern that allowed us 
to predict the general result? Another time, when discussing inverse functions, we
did an impromptu in-class exploration of which linear functions, and later, which 
general functions, are their own inverses. Such discussions involved much conjec- 
turing, arguing, and scratchpaper seatwork, individually and in pairs, among the 
students, and were opportunities for algebraic, geometric, and numerical explo- 
ration. 

Because of my class’s general enthusiasm and willingness to participate, I felt
we had developed a classroom atmosphere free of criticism, and open to conjectur- 
ing. The students often presented problems for each other, furthering their own 
mathematical communication. Although I was somewhat disappointed not to be 
able to watch my students develop cooperative skills, I felt that we had made the
best of the situation, and reached a workable compromise: modified traditional
lecturing in a nontraditional atmosphere. 

This arrangement continued in my class for about eight weeks until we came to 
the point where the students were ready to see the Fundamental Theorem of 
Calculus, the culmination of much study of differentiation and some study of 
integration, principally from a conceptual viewpoint. Primarily because I was so 
enchanted with the textbook’s intuitive approach (we were using [2], produced by 
the Consortium based at Harvard), I decided to use the last fifteen minutes of class 
to present the proof of the theorem, instead of the usual timeshare between 
examples and discussion. After all, we had certainly developed the necessary 
machinery, and I thought exposure to the underpinnings of this grand relationship 
would benefit all of us. My thoughtful reflection and development of the lecture 
would certainly make the connections clear to everyone. 

Proving the Fundamental Theorem of Calculus to my class was one of the 
highlights of teaching that semester. This was to be the only theorem whose proof I 
presented, but I felt its phenomenal usefulness merited a little extra attention. I 
derived a great deal of satisfaction from my presentation of the proof, and was 
quite happy with the support of the textbook as well. The class was relatively quiet 
during that period, but I felt the students just must have understood and appreci- 
ated the lecture. The examples and problems the students had worked (or at least 
seen) were an almost perfect framework in which to nestle this new theorem, and I 
myself looked forward to working the homework problems that required its 
application. 

The next day I was surprised by the unusually large number of students who 
appeared during my office hour. My small office began to overflow, so we moved 
to a classroom with a blackboard to better accommodate ourselves. It appeared 
that the students were having a hard time even starting the homework problems I 
had assigned for practice, let alone knowing when to apply their grand new 
theorem. Together, my students and I worked through three or four of the 
problems, appealing frequently to the theorem. At one point, one of my students 
cried out, “I’m not sure I even believe that relationship ... how would you prove 
something like that?” The other students then began talking among themselves: 
“The proof is that thing she did yesterday at the end of class”, and “That was so 
confusing!” and worst of all, “Were you listening to what she was talking about? I 
was staring out the window the whole time!” I felt a strong sense of disappoint- 
ment. After all, I had enjoyed giving the presentation so much! At this point, I 
seriously began to consider my reasons for standing up at the board. 

The last straw came a couple of weeks later, when I gave my class a midsemester 
course evaluation, to be completed anonymously. My students were handing in 
high quality work, and class participation was greater than in any class I had taught 
before, but I was interested in their impressions. One student wrote, “I wish we 
had more time for discussion among ourselves, instead of watching the instructor 
up at the bored.” (sic) This very telling slip on my student’s part, combined with the
earlier feedback, made me ask myself if I could ever stand up at the “bored” 
without my students mentally shutting down. 

Looking for an answer to this question forced me to ask myself another: “What
is it that keeps me at the board, telling my students about mathematics, with 
seamless lectures and well-chosen examples?” For an additional perspective, I 
posed this question to a colleague of mine. He and I quickly generated a 
chalkboard full of reasons why we continue to lecture, even after frequent and 
positive experiences with alternative forms of teaching. Upon further reflection, I
discovered that the reasons we listed fell into three categories: reasons lecturing
seems to benefit the instructor directly, reasons the instructor thinks lecturing 
benefits the students, and reasons students prefer the instructor to lecture; these 
three areas were not disjoint. I organized and sorted the list, and went on to 
respond mentally to each reason we had listed. Following is an organized list of 
these reasons for lecturing and “reflections” responding to each reason. To 
preserve the spirit in which these ideas developed, two voices (a point and a
counterpoint) will be heard throughout this discussion. The first voice reflects my 
initial tendencies, and, based on my own mathematics classroom training and 
experience, advocates the lecture method. The second voice counters the first, 
suggesting ways of maintaining mathematical standards while attempting to involve 
students more actively in the learning process. 


REASONS THAT APPEAR TO BENEFIT THE INSTRUCTOR DIRECTLY. First, 
lecturing gives me a feeling of control. If I stand up at the board day after day and 
“present” or “cover” the material, I know exactly what my students have seen. I am 
quite able to keep up with any department-mandated syllabus, and can expose my 
students to exactly the techniques, examples, and ideas I want them to think about. 
The pace might not be quite right for each student, but I am confident that the 
middle third is reasonably challenged, and I am more than willing to interrupt my 
monologue to answer questions. 

A second reason I continue to lecture is that it assuages any guilt I might feel
after using alternative methods for a period of time. What if they are not really 
“getting it” in their cooperative groups, or their interactive sessions in the com- 
puter lab? If I tell them the rules, and show them the examples, the responsibility 
for their learning somehow transfers to their shoulders. Now that I’ve said it, it is 
fair game on homework and exams. If my students aren’t able to demonstrate 
mastery of the material when I assess them, the burden lies on them. They must 
not have been listening when I presented this concept? 

My third reason for lecturing is more personal than the first two. I get 
tremendous satisfaction from preparing and presenting a well-laid out elegant set 
of accessible ideas, each of which builds on the one before. After all, this is 
material I love, and a favorite way to share it is to lay it out neatly for students. In 
addition, each time I prepare a topic to be presented for my students’ eyes I gain 
additional mathematical insight. It is very easy for me to generate my own 
enthusiasm and come back to class day after day when I am “presenting.” 

My last reason really falls into all three categories: I am comfortable at the
board, and my students are comfortable (perhaps too much so) when I am at the 
board as well. It is more challenging for all of us when I have them work in groups 
or present problems for each other. 


REFLECTIONS: Let’s first consider the idea of control. While it is true that by 
lecturing, I control the material being presented, I certainly have no control over 
what the students choose to listen to, and even less over what they choose to study. 
If instead I employ alternative methods of teaching, such as cooperative group- 
work, I really am not relinquishing any valuable control, and instead gain the 
possibility of more actively engaging the students in the learning process. On one 
hand, it is certainly possible for a student to be an active participant in the 
mathematical process of listening to a lecture. He or she may pose questions, 
anticipate examples or counterexamples of concepts being presented, or make 
educated guesses about the next step of the proof. However, my experience with
randomly calling on students during a lecture would imply that such engagement is
not the norm. Cooperative groupwork offers instructors an alternative classroom 
modality with a chance of greater student participation; it is much more difficult to 
“hide” in a group of four than in a classroom of forty. 

I might let go of my second concern, namely the guilt associated with letting 
students pose and verify their own conjectures, when I realize that the lecture 
method ensures the coverage of, but not necessarily the learning of material at 
hand. As long as the group activities (lab work, projects, etc.) are designed to 
include the students’ reflecting on what it is they have discovered, coupled with 
timely feedback from me, I may feel just as confident the material has been duly 
“covered”, and perhaps with greater student understanding, due to their participa- 
tion. 

An example of an activity specifically designed to determine whether or not 
students were making their own connections outside of class was an expository 
paper requiring students to reconcile the formal definition of the derivative of a 
function at a point with their personal conceptual understanding. I provided only a 
few guidelines: students were to use graphical, numerical, and algebraic ap- 
proaches in their explanations and were to enumerate at least two distinct 
applications of the derivative of a function at a point. Their audience was a 
(fictitious) fellow student who had missed the pertinent days of class. 

Completing this writing assignment provided a basis of understanding that many 
of my students referred to repeatedly through both semesters of the calculus 
sequence. The struggle to gather, refine, and articulate their thoughts encouraged 
them to expand what Vinner [9] terms the student’s concept image (that is, the 
nonverbal entity, correct or not, that the student associates with the concept
name) until it became compatible with the concept definition of a derivative at a 
point. Thus, I did not spend class time trying to deliver a lecture tailored to fill in 
the gaps of my students’ individual and collective understanding of derivative. 
Instead, I developed a task that both enabled students to make their own 
connections and provided a window into the understanding each had constructed 
of algebraic, geometric and numerical relationships. 

In addressing the third reason for lecturing, I must first admit that the 
satisfaction I get from preparing a lecture is undeniable, and stems from at least 
two components: assembling my mathematical ideas into a coherent structure, and 
the resulting deeper understanding. However, while preparing for a class session 
that more actively involves a greater portion of the students, I may experience both 
of these components by changing my expectations for the class period, and devising 
supportive materials accordingly. For example, Monk and Finkel [3] suggest 
putting the creative energy usually reserved for planning lectures into constructing 
a sequence of probing, converging questions that elicit critical student thinking. 
These questions might then be answered by the students after they have studied a 
mathematical concept; their answers can provide material for group discussion. 

Finally, while the comfort of the well-known lecture method is indisputable, it is 
a comfort that may breed complacency among the students and in the instructor. 
Thus it is not necessarily a benefit. Employing alternative methods of teaching 
certainly forces instructors and students outside their “comfort zones.” However, 
Reynolds et al. [7] say as long as the instructor is “...an enthusiastic advocate of 
working cooperatively ...,” students tend to adapt to cooperative groupwork to the 
point that the class structure does not interfere unduly with their learning. 


REASONS FOR LECTURING THAT APPEAR TO BENEFIT THE STUDENTS. 
These reasons are strictly from an instructor’s point of view; the students’ perspective
is discussed later. One driving force behind the lecture method has been that
it provides quick exposure to a body of knowledge to many people at one time. 
With it, I might summarize years of investigative work in a fifty-minute session. 
Thus I can show my students a larger amount of material in one semester. 

The second benefit for students goes hand-in-hand with the preceding one: a 
well-laid out lecture gives students, particularly upperclassmen, exposure to the 
potential beauty and elegance of mathematics, and provides a framework in which 
to place the problems and proofs they are working. When students have access to 
such a lecture, they might get as much pleasure from listening as I do from telling. 

Finally, in this category is the fact that lecturing demonstrates to students at all 
levels one way in which to convey mathematical ideas, both verbally and symboli- 
cally. Day after day, students can listen and watch a person speak and write 
mathematics. There is an unstated assumption that students are not only learning 
the mathematical content being presented, but are somehow internalizing the style 
in which a mathematician communicates mathematics. 


REFLECTIONS: I will rebut all three reasons of this category simultaneously. 
The weakness of each argument (from the standpoint of my calculus class) is the 
same: the goals are very lofty, but they are not being realized. Lecturing would be 
a remarkable pedagogical tool if students were able to internalize “years of 
investigative work,” appreciate the “potential beauty and elegance of mathematics,” 
and learn to “convey mathematical ideas, both verbally and symbolically,” after 
listening to the teacher standing at the board. However, my calculus class demon- 
strated through their inability to connect with the Fundamental Theorem of 
Calculus (and in other instances, such as applying the chain rule, or carrying out 
numerical algorithms) that this often is not the case. 


REASONS THAT STUDENTS PREFER AN INSTRUCTOR TO LECTURE. As 
much as students complain of the tedium associated with listening to an instructor 
for a significant portion of the class period, there still are reasons students may 
prefer this format to another. The first is a natural consequence of the climate 
established in the typical lecture-format classroom: when the teacher presents 
exactly the material that is needed for successful completion of the course, there is 
no need for the students to read the textbook. For the students, the textbook is 
then an example (template) reference for homework problems, and a source for 
teacher-assigned exercises. A corollary to this is that when the teacher exclusively 
employs the lecture method, it sends the very clear message to the students that 
whatever is important will be covered in class, so there is no need to sort out for 
oneself the major concepts from the supportive material. In particular, an instruc- 
tor in this setting would be breaking an unwritten rule to test students on an idea 
not explicitly covered in lecture. 

Finally, students like an instructor to lecture for the simple reason that it is 
much more restful, both physically and mentally, than the more active group effort 
required to solve a problem or individual effort required to present a problem on 
the board. Depending on the classroom dynamic, students could go through the 
whole semester without speaking in their math class if they never ventured to ask a 
question. 


REFLECTIONS: The reasons from the students’ perspective are probably the
easiest category to counter. If students receive the message that the prose of the
text is superfluous (don’t read it), and further that the teacher will implicitly make
plain what should be studied (what is on the test), they cannot be expected to
develop mathematical autonomy. Similarly, for many students, the role of listening 
to a lecture, no matter how conceptually challenging, does not imply the develop- 
ment of ownership of the mathematical ideas being presented. 


SUMMARY. Based on the collective experiences of this calculus class and on 
other courses I have taken and taught, I claim that the classroom with lecture as 
the only mode for instruction is obsolete. Lecture tends to minimize the opportu- 
nity for engagement of the learner with mathematics. Fortunately, making a 
transition from a classroom atmosphere having exclusively the traditional lecture 
format to one characterized by a blend of lecturing and alternative learning 
methods need not imply a compromise. I have argued that having students present 
problems to the whole class, for instance, does not lower academic standards but 
rather demands and develops a higher level of mathematical communication.
Instructors employing cooperative groupwork do not relinquish control over the 
curriculum; they instead recreate the atmosphere in which students learn the 
mathematics of a curriculum determined by the instructor. Thus, instructors may 
simultaneously maintain mathematical standards and engage a greater fraction of 
their students with the material, an activity not explicitly demanded in a classroom 
where the instructor is the sole resource for knowledge. Lecturing should be 
limited to situations where students are at least partially engaged due to the 
climate the instructor has developed, and should be used to introduce students to 
some basic concepts as a springboard for their students’ own discoveries. This 
extends the power of the lecture, allowing the instructor to maximize its potential 
and to minimize lecturing at the bored.


NOTES


Edited by Jimmie D. Lawson 


A Generalization of Wolstenholme’s 
Theorem 


M. Bayat 


In this article we prove a generalization of Wolstenholme’s theorem [2] with a 
simpler proof using group theory and number theory. 


Theorem 1. (Wolstenholme). If $p > 3$ is prime, then the numerator of the fraction
$1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{p-1}$ is divisible by $p^2$.


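Proof: First observe that

$$\begin{aligned}
1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{p-1}
&= \left(1 + \frac{1}{p-1}\right) + \left(\frac{1}{2} + \frac{1}{p-2}\right) + \cdots + \left(\frac{1}{\frac{p-1}{2}} + \frac{1}{\frac{p+1}{2}}\right) \\
&= p\left[\frac{1}{1(p-1)} + \frac{1}{2(p-2)} + \cdots + \frac{1}{\left(\frac{p-1}{2}\right)\left(\frac{p+1}{2}\right)}\right] \\
&= p\,\frac{A}{(p-1)!},
\end{aligned}$$

where

$$A = \frac{(p-1)!}{1(p-1)} + \frac{(p-1)!}{2(p-2)} + \cdots + \frac{(p-1)!}{\left(\frac{p-1}{2}\right)\left(\frac{p+1}{2}\right)}.$$

In the following it is shown that $A$ is divisible by $p$. Consider now the multiplicative
$Z_p$-group consisting of all non-zero congruence classes mod $p$. We choose $\bar r$ in
the $Z_p$-group corresponding to $(p-1)!/i(p-i)$, i.e.,

$$r \equiv \frac{(p-1)!}{i(p-i)} \pmod p \qquad \left[1 \le i \le \tfrac{p-1}{2}\right].$$

Thus

$$r\,i(p-i) \equiv (p-1)! \pmod p \qquad \left[1 \le i \le \tfrac{p-1}{2}\right].$$

According to Wilson’s theorem [1], $(p-1)! \equiv -1 \pmod p$, and so

$$-r\,i^2 \equiv -1 \pmod p \qquad \left[1 \le i \le \tfrac{p-1}{2}\right].$$

In $Z_p$ this means that $i^2$ is the inverse of $\bar r$. Since

$$i^2 \equiv (p-i)^2 \qquad \left[1 \le i \le \tfrac{p-1}{2}\right],$$

every square of $Z_p$ arises as a square from only half the elements of $Z_p$,
$\{1, 2, \ldots, \frac{p-1}{2}\}$. Furthermore, the squaring map is injective on this set. Since the
inverse map is bijective on the square elements, every $\bar r$ is equal to the square of
one of the numbers $1, 2, \ldots, \frac{p-1}{2}$; hence

$$A \equiv 1^2 + 2^2 + \cdots + \left(\frac{p-1}{2}\right)^2 = \frac{(p-1)\,p\,(p+1)}{24} \equiv 0 \pmod p$$

and this completes the proof of the theorem.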
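Theorem 1 is easy to check for small primes with exact rational arithmetic. The following Python sketch is an illustration only; the helper name `harmonic_numerator` is a hypothetical choice, and the list of primes is an arbitrary sample.

```python
from fractions import Fraction

def harmonic_numerator(p):
    """Numerator, in lowest terms, of 1 + 1/2 + ... + 1/(p-1)."""
    return sum(Fraction(1, i) for i in range(1, p)).numerator

# For each sampled prime p > 3, record whether p^2 divides the numerator.
divisible = {p: harmonic_numerator(p) % (p * p) == 0
             for p in (5, 7, 11, 13, 17, 19, 23)}
```

For instance, $1 + \frac12 + \frac13 + \frac14 = \frac{25}{12}$, and $25$ is divisible by $5^2$.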


Theorem 2. If $p > 3$ is prime, then the numerator of $1 + \frac{1}{2^2} + \cdots + \frac{1}{(p-1)^2}$ is divisible
by $p$.

Proof: The numerator of the fraction is

$$A = \frac{(p-1)!^2}{1^2} + \frac{(p-1)!^2}{2^2} + \cdots + \frac{(p-1)!^2}{(p-1)^2}.$$

First of all we calculate $(p-1)!^2/i^2$ for $1 \le i \le p-1$. Let $i$ be arbitrary and
$1 \le i \le p-1$. There exists $\bar r$ in $Z_p$ such that

$$r \equiv \frac{(p-1)!^2}{i^2} \pmod p \qquad (1 \le i \le p-1),$$

and thus

$$r\,i^2 \equiv 1 \pmod p.$$

In other words, $i^2$ is the inverse of $\bar r$ in $Z_p$. Indeed

$$\bar r = (i^{-1})^2 = (p - i^{-1})^2 \qquad (1 \le i \le p-1).$$

Therefore, $\bar r$ is one of the following:

$$1^2,\ 2^2,\ \ldots,\ \left(\frac{p-1}{2}\right)^2.$$

Hence

$$A \equiv 2\left[1^2 + 2^2 + \cdots + \left(\frac{p-1}{2}\right)^2\right] \equiv 0 \pmod p,$$

which gives $p \mid A$.


Theorem 3. Let $p$ be prime and $k \in N$ such that $2k < p - 1$. Then
(i) the numerator of the fraction

$$1 + \frac{1}{2^{2k-1}} + \cdots + \frac{1}{(p-1)^{2k-1}}$$

is divisible by $p^2$.
(ii) The numerator of the fraction

$$1 + \frac{1}{2^{2k}} + \cdots + \frac{1}{(p-1)^{2k}}$$

is divisible by $p$.

Proof: (i) The numerator of the fraction is

$$A = \frac{(p-1)!^{2k-1}}{1^{2k-1}} + \frac{(p-1)!^{2k-1}}{2^{2k-1}} + \cdots + \frac{(p-1)!^{2k-1}}{(p-1)^{2k-1}}.$$

To each element of $A$ we can associate an element $\bar r \in Z_p$. In fact,

$$r \equiv \frac{(p-1)!^{2k-1}}{i^{2k-1}} \pmod p \qquad (1 \le i \le p-1)$$

satisfies $r\,i^{2k-1} \equiv -1 \pmod p$. This means that in $Z_p$, $\bar r = (-i^{2k-1})^{-1}$
$(1 \le i \le p-1)$, and, since the inverse is unique, $-\bar r$ is one of the $(2k-1)$-th powers of
$1, 2, \ldots, (p-1)$. Hence, the numerator of the fraction is

$$\equiv -\left(1^{2k-1} + 2^{2k-1} + \cdots + (p-1)^{2k-1}\right) \pmod p. \qquad (1)$$

In order to calculate the right-hand side of (1), we use Euler’s formula [1]

$$1^m + 2^m + \cdots + (n-1)^m = \frac{1}{m+1} \sum_{r=0}^{m} \binom{m+1}{r} B_r\, n^{m+1-r},$$

where the $B_r$ are Bernoulli’s numbers. Indeed $B_0 = 1$, $B_1 = -\frac{1}{2}$, $B_{2k+1} = 0$, and
$B_{2k} = (-1)^{k-1} B_k$ for $k \ge 1$, where the $B_k$ are determined by the coefficients of $x^{2k}$ in the series
expansion of $x/(e^x - 1)$. Taking $m = 2k - 1$ and $n = p$ gives

$$1^{2k-1} + 2^{2k-1} + \cdots + (p-1)^{2k-1} = \frac{1}{2k} \sum_{r=0}^{2k-1} \binom{2k}{r} B_r\, p^{2k-r},$$

and so

$$2k\left(1^{2k-1} + 2^{2k-1} + \cdots + (p-1)^{2k-1}\right) = \sum_{r=0}^{2k-1} \binom{2k}{r} B_r\, p^{2k-r}.$$

The right-hand side is divisible by $p^2$ since $B_{2k-1} = 0$, and hence

$$p^2 \mid 2k\left(1^{2k-1} + 2^{2k-1} + \cdots + (p-1)^{2k-1}\right).$$

Since $2k < p - 1$, this implies $p^2 \mid A$, and this completes the proof of (i). The proof
of (ii) is similar to (i).
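Both parts of Theorem 3 can be spot-checked with exact fractions. The sketch below is illustrative only: `power_sum_numerator` and `check_theorem_3` are hypothetical helper names, and the check covers only the small primes sampled, so it is evidence for the statement rather than a substitute for the proof.

```python
from fractions import Fraction

def power_sum_numerator(p, m):
    """Numerator, in lowest terms, of sum_{i=1}^{p-1} 1/i^m."""
    return sum(Fraction(1, i ** m) for i in range(1, p)).numerator

def check_theorem_3(p, k):
    """Return (part_i_holds, part_ii_holds) for a prime p and 2k < p - 1."""
    if not 2 * k < p - 1:
        raise ValueError("Theorem 3 assumes 2k < p - 1")
    part_i = power_sum_numerator(p, 2 * k - 1) % (p * p) == 0   # exponent 2k-1, modulus p^2
    part_ii = power_sum_numerator(p, 2 * k) % p == 0            # exponent 2k, modulus p
    return part_i, part_ii

# All admissible k for a few small primes.
results = {(p, k): check_theorem_3(p, k)
           for p in (5, 7, 11, 13)
           for k in range(1, (p - 1) // 2)}
```

With $k = 1$ the two parts reduce to Theorems 1 and 2.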
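Theorem 4. Let $p$ be a prime and let $k \in N$ such that $2k < p - 1$. Then

(i) The numerator of the fraction $\sum_{i=1}^{p^n-1} 1/i^{2k-1}$, $[(i, p) = 1]$ is divisible by $p^{n+1}$.

(ii) The numerator of the fraction $\sum_{i=1}^{p^n-1} 1/i^{2k}$, $[(i, p) = 1]$ is divisible by $p^n$.

Proof: (i) According to Theorem 3 the numerator of each of the following numbers
is divisible by $p^2$:

$$A_1 = \frac{1}{1^{2k-1}} + \frac{1}{2^{2k-1}} + \cdots + \frac{1}{(p-1)^{2k-1}},$$

$$A_2 = \frac{1}{(p+1)^{2k-1}} + \frac{1}{(p+2)^{2k-1}} + \cdots + \frac{1}{(2p-1)^{2k-1}},$$

$$\vdots$$

$$A_{p^{n-1}} = \frac{1}{(p^n-p+1)^{2k-1}} + \frac{1}{(p^n-p+2)^{2k-1}} + \cdots + \frac{1}{(p^n-1)^{2k-1}}.$$

A proof similar to the one given in Theorem 3 shows that the numerator of each of
these fractions is equivalent to $1^{2k-1} + 2^{2k-1} + \cdots + (p-1)^{2k-1} \pmod p$. Since
there are $p^{n-1}$ of these fractions we have

$$p^{n-1}\left(1^{2k-1} + 2^{2k-1} + \cdots + (p-1)^{2k-1}\right) \equiv 0 \pmod{p^{n+1}},$$

and the numerator of $\sum_{i=1}^{p^{n-1}} A_i$ is divisible by $p^{n+1}$. This completes the proof of (i).
A similar argument proves (ii).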
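The statement of Theorem 4 can also be tested numerically for small cases. The following sketch is an illustration under the reading of the theorem used here (sums over $1 \le i \le p^n - 1$ with $i$ prime to $p$); the function name `coprime_power_sum_numerator` and the sampled triples $(p, n, k)$ are hypothetical choices.

```python
from fractions import Fraction
from math import gcd

def coprime_power_sum_numerator(p, n, m):
    """Numerator, in lowest terms, of the sum of 1/i^m over
    1 <= i <= p^n - 1 with gcd(i, p) = 1."""
    total = sum(Fraction(1, i ** m)
                for i in range(1, p ** n) if gcd(i, p) == 1)
    return total.numerator

def check_theorem_4(p, n, k):
    """Return (part_i_holds, part_ii_holds) for a prime p with 2k < p - 1."""
    part_i = coprime_power_sum_numerator(p, n, 2 * k - 1) % p ** (n + 1) == 0
    part_ii = coprime_power_sum_numerator(p, n, 2 * k) % p ** n == 0
    return part_i, part_ii

samples = [(5, 2, 1), (7, 2, 1), (7, 2, 2)]
outcomes = [check_theorem_4(p, n, k) for p, n, k in samples]
```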
A Note on the Mean Value Theorem
for Integrals 


Zhang Bao-lin 


The purpose of this note is to extend a result by Bernard Jacobson [1] concerning 
the mean value theorem for integrals. 


Theorem 1. If the function $f$ is continuous on the interval $[a, b]$, then there is a
number $c$ such that $a < c < b$ and

$$\int_a^b f(x)\,dx = f(c)(b - a). \qquad (1)$$

If $x \in (a, b)$ then by Theorem 1 applied to the interval $[a, x]$, it is possible to
choose a number $c_x$ $(a < c_x < x)$ as a function of $x$ on $(a, b)$ such that

$$\int_a^x f(t)\,dt = f(c_x)(x - a). \qquad (2)$$

Jacobson studied the behavior of $c_x$ as $x$ approaches $a$, and proved the following
result:


Theorem 2. Suppose the function $f$ is continuous on the interval $[a, b]$ and is
differentiable at $a$ with $f'(a) \neq 0$. If $c_x$ is given in the mean value formula (2) for
integrals, then

$$\lim_{x \to a} \frac{c_x - a}{x - a} = \frac{1}{2}. \qquad (3)$$

We consider the case $f'(a) = 0$ in Theorem 2.


Theorem 3. Suppose the function $f$ is continuous on the interval $[a, b]$ and is
twice-differentiable at $a$ with $f'(a) = 0$, $f''(a) \neq 0$. If $c_x$ is given in the mean value
formula (2) for integrals, then

$$\lim_{x \to a} \frac{c_x - a}{x - a} = \frac{1}{\sqrt{3}}. \qquad (4)$$

Proof: Consider Taylor’s expansion

$$f(t) = f(a) + \frac{f''(a)}{2}(t - a)^2 + \varepsilon(t)(t - a)^2 \qquad (5)$$

where $\varepsilon(t) \to 0$ as $t \to a$. Integrate (5) from $a$ to $x$, and obtain

$$\int_a^x f(t)\,dt = f(a)(x - a) + f''(a)\frac{(x - a)^3}{6} + \int_a^x \varepsilon(t)(t - a)^2\,dt. \qquad (6)$$

On the other hand, equation (5) evaluated at $c_x$ $(a < c_x < b)$ yields

$$f(c_x) = f(a) + \frac{f''(a)}{2}(c_x - a)^2 + \gamma(c_x)(c_x - a)^2, \qquad (7)$$

where $\gamma(c_x) \to 0$ as $c_x \to a$. Thus

$$f(c_x)(x - a) = f(a)(x - a) + \frac{f''(a)}{2}(c_x - a)^2 (x - a) + \gamma(c_x)(c_x - a)^2 (x - a). \qquad (8)$$

From (2), (6), and (8) we have

$$f''(a)(x - a)^3 + 6\int_a^x \varepsilon(t)(t - a)^2\,dt = 3 f''(a)(c_x - a)^2 (x - a) + 6\gamma(c_x)(c_x - a)^2 (x - a). \qquad (9)$$

It is easy to see that

$$\lim_{x \to a} \frac{\int_a^x \varepsilon(t)(t - a)^2\,dt}{(x - a)^3} = 0, \qquad \lim_{x \to a} \frac{\gamma(c_x)(c_x - a)^2}{(x - a)^2} = 0.$$

From the condition $f''(a) \neq 0$ and from (9) we obtain

$$\lim_{x \to a} \left(\frac{c_x - a}{x - a}\right)^2 = \frac{1}{3}.$$

In a similar way one can establish the following more general result. 


Theorem 4. Suppose the function $f$ is continuous on the interval $[a, b]$, and is $k$ times
differentiable at $a$ with $f^{(j)}(a) = 0$ $(j = 1, 2, \ldots, k - 1)$, $f^{(k)}(a) \neq 0$. If $c_x$ is given in
the mean value formula (2) for integrals, then

$$\lim_{x \to a} \frac{c_x - a}{x - a} = \frac{1}{\sqrt[k]{k+1}}.$$
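The limits in Theorems 2 and 3 can be observed numerically when $c_x$ has a closed form. The sketch below is an illustration with hypothetical names and hand-picked test functions: for $f = \exp$ and $a = 0$ (so $f'(0) \neq 0$), formula (2) gives $c_x = \log((e^x - 1)/x)$; for $f = \cos$ and $a = 0$ (so $f'(0) = 0$, $f''(0) = -1$), it gives $c_x = \arccos(\sin x / x)$. The value $(k+1)^{-1/k}$ is the limit as read from Theorem 4.

```python
import math

def ratio_exp(x):
    """(c_x - a)/(x - a) for f = exp, a = 0: exp(c_x) * x = exp(x) - 1."""
    c = math.log(math.expm1(x) / x)
    return c / x

def ratio_cos(x):
    """Same ratio for f = cos, a = 0: cos(c_x) * x = sin(x)."""
    c = math.acos(math.sin(x) / x)
    return c / x

def predicted_limit(k):
    """Limit read from Theorem 4 when the first nonzero derivative at a has order k."""
    return (k + 1) ** (-1.0 / k)

x = 1e-3
observed = {"exp, k=1": ratio_exp(x), "cos, k=2": ratio_cos(x)}
```

At $x = 10^{-3}$ the two ratios are already close to $1/2$ and $1/\sqrt{3} \approx 0.577$ respectively.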
UNSOLVED PROBLEMS
Edited by Richard Nowakowski 


In this department the MONTHLY presents easily stated unsolved problems dealing with 
notions ordinarily encountered in undergraduate mathematics. Each problem should be 
accompanied by relevant references (if any are known to the author) and by a brief
description of known partial or related results. Typescripts should be sent to Richard
Nowakowski, Department of Mathematics & Statistics & Computing Science, Dalhousie 
University, Halifax NS, Canada B3H 3J5, rjin@cs.dal.ca 


When Is There a Latin Power Set? 


J. Dénes 


In [6] we find that the images of 0123 under the permutations (0)(123), (1)(032), (2)(013), (3)(021) and those under their squares (0)(132), (1)(023), (2)(031), (3)(012) respectively form the Latin squares

$$L = \begin{pmatrix} 0 & 2 & 3 & 1 \\ 3 & 1 & 0 & 2 \\ 1 & 3 & 2 & 0 \\ 2 & 0 & 1 & 3 \end{pmatrix}, \qquad L^2 = \begin{pmatrix} 0 & 3 & 1 & 2 \\ 2 & 1 & 3 & 0 \\ 3 & 0 & 2 & 1 \\ 1 & 2 & 0 & 3 \end{pmatrix}.$$

In general, a set of n permutations $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ on a set S of n symbols form a Latin square in this way just if they are sharply transitive, i.e., if, for all $i \in S$ and $0 \le j < k \le n - 1$, $\alpha_j(i) \neq \alpha_k(i)$. It may happen that the set of squares $\alpha_0^2, \alpha_1^2, \ldots, \alpha_{n-1}^2$ are also sharply transitive, so that they too form a Latin square $L^2$, which is necessarily orthogonal to L. If each set of k-th powers $\alpha_0^k, \alpha_1^k, \ldots, \alpha_{n-1}^k$, for $1 \le k \le m$, is sharply transitive, then the associated Latin squares $L, L^2, \ldots, L^m$ form a Latin power set, which is a mutually orthogonal set; see [4] or [3, p. 231].
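
The opening example can be checked mechanically. The sketch below is an illustration only: the helper names `cycles_to_map`, `is_latin`, and `are_orthogonal` are invented here. It builds the four permutations from their cycle notation, forms L from the images of 0123 and L² from the images under the squared permutations, and tests the Latin and orthogonality properties.

```python
from itertools import product


def cycles_to_map(cycles, n):
    """Convert cycle notation into a list p with p[i] = image of i."""
    p = list(range(n))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def is_latin(square):
    """Every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def are_orthogonal(s1, s2):
    """Superimposing the squares yields every ordered pair exactly once."""
    n = len(s1)
    pairs = {(s1[r][c], s2[r][c]) for r, c in product(range(n), repeat=2)}
    return len(pairs) == n * n


# (0)(123), (1)(032), (2)(013), (3)(021)
cycle_forms = [
    [[0], [1, 2, 3]],
    [[1], [0, 3, 2]],
    [[2], [0, 1, 3]],
    [[3], [0, 2, 1]],
]
perms = [cycles_to_map(c, 4) for c in cycle_forms]

L1 = [list(p) for p in perms]                        # images of 0123
L2 = [[p[p[i]] for i in range(4)] for p in perms]    # images under the squares

print(L1, is_latin(L1))
print(L2, is_latin(L2))
print("orthogonal:", are_orthogonal(L1, L2))
```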


Conjecture 1 [4]. If n > 4 and n ≠ 6, then there exists a Latin power set containing two Latin squares of order n.


If true, this would give another refutation of Euler's 1782 conjecture that no pairs of orthogonal Latin squares of order n exist for n ≡ 2 mod 4. Euler's conjecture was, of course, disproved in [2]; see also [3, §11.2], and it is now known that for n ≠ 2 or 6 there exists a pair of orthogonal Latin squares of order n. Conjecture 1 is known to be true for 7 < n < 50 and for all larger n except possibly those of shape 6k + 2; see [4] for details.

Keedwell ([5] and see [3, §7.1]) used a special set of sharply transitive permutations to obtain orthogonal Latin squares, which suggests the following


Conjecture 2. For every even n, n > 8, there exists a set of permutations each of which fixes one symbol while the other n − 1 symbols form a single cycle, such that the associated L and L² are Latin squares.
Our opening example is of this type. Conjecture 2 is true when n is a prime power q. In this case take $\alpha_0 = (0)(1, \omega, \omega^2, \ldots, \omega^{q-2})$ where $\omega$ is a primitive root of the Galois field GF[q] and

$$\alpha_{i+1} = (\omega^i)(1 + \omega^i, \omega + \omega^i, \ldots, \omega^{q-2} + \omega^i)$$

for $0 \le i \le q - 2$ (see [8]). More generally, we can construct a Latin power set with as many as h members if there is a group G of order n that is $R_h$-sequenceable.

We say that a group (G, +) is R-sequenceable ($R_1$-sequenceable) if its elements $a_0 = 0, a_1, \ldots, a_{n-1}$ can be ordered in such a way that the partial sums $b_0 = a_0$, $b_1 = a_0 + a_1$, $b_2 = a_0 + a_1 + a_2$, ..., $b_{n-2} = a_0 + a_1 + \cdots + a_{n-2}$ are all different and $b_{n-1} = a_0 + a_1 + \cdots + a_{n-1} = b_0 = 0$. This means that $b_1 - b_0, b_2 - b_1, \ldots, b_{n-2} - b_{n-3}, b_0 - b_{n-2}$ are all distinct and so coincide with the non-identity elements of G. If further, for each j, j = 2, 3, ..., h, the set of differences $b_{i+j} - b_i$, i = 0, 1, ..., n − 2, also are all distinct, where suffixes are taken modulo n − 1, we say that (G, +) is $R_h$-sequenceable.

To construct a power set with h members we take $\alpha_0 = (c)(b_0, b_1, \ldots, b_{n-2})$ where the $b_i$ are the partial sums for the $R_h$-sequencing and c is the element that does not occur among the partial sums, and

$$\alpha_j = (c + a_j)(b_0 + a_j, b_1 + a_j, \ldots, b_{n-2} + a_j)$$

for each group element $a_j$, $1 \le j \le n - 1$. Then the permutations $\alpha_0^k, \alpha_1^k, \ldots, \alpha_{n-1}^k$ form a sharply transitive set for k = 1, 2, ..., h, and so we can construct a Latin power set with h members. Note that the Galois field construction is a special case since the additive group of GF[q] is $R_{q-2}$-sequenceable (see [7]). Note also that the Latin squares constructed by this method, i.e., the method of Conjecture 2, are all idempotent. It has been shown in [5], see also [3, §7.4], that if the requirement of idempotency is dropped then an $R_h$-sequenceable group of order n permits construction of at least h + 1 pairwise orthogonal Latin squares.
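
The construction can be tried out on a small case. The sketch below is an illustration under stated assumptions: it takes the Galois field case q = 7 with primitive root ω = 3, so that α_0 = (0)(1, 3, 2, 6, 4, 5) and every other permutation is a translate of α_0 by a group element of Z_7. The function names and the choice h = 5 are made up for this example. It then checks that the powers k = 1, ..., 5 give pairwise orthogonal Latin squares.

```python
from itertools import combinations, product


def is_latin(square):
    """Every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def are_orthogonal(s1, s2):
    """Superimposing the squares yields every ordered pair exactly once."""
    n = len(s1)
    pairs = {(s1[r][c], s2[r][c]) for r, c in product(range(n), repeat=2)}
    return len(pairs) == n * n


def latin_power_set(n, cycle, h):
    """Squares L, L^2, ..., L^h over Z_n built from alpha_0 = (c)(cycle)."""
    (c,) = set(range(n)) - set(cycle)      # the single fixed symbol
    alpha0 = {c: c}
    for i, b in enumerate(cycle):
        alpha0[b] = cycle[(i + 1) % len(cycle)]

    def alpha(j, x):
        # alpha_j = (c + j)(b_0 + j, ..., b_{n-2} + j): a translate of alpha_0
        return (alpha0[(x - j) % n] + j) % n

    squares = []
    for k in range(1, h + 1):
        square = []
        for j in range(n):
            row = []
            for x in range(n):
                y = x
                for _ in range(k):         # apply alpha_j k times
                    y = alpha(j, y)
                row.append(y)
            square.append(row)
        squares.append(square)
    return squares


omega = 3                                   # assumed primitive root mod 7
cycle = [pow(omega, i, 7) for i in range(6)]
squares = latin_power_set(7, cycle, 5)

print("cycle:", cycle)
print("all Latin:", all(is_latin(s) for s in squares))
print("pairwise orthogonal:",
      all(are_orthogonal(s, t) for s, t in combinations(squares, 2)))
```

Rows are indexed by the translating group element j and columns by the symbol x, so row j of the k-th square lists the images of 0, 1, ..., 6 under the k-th power of α_j.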

For the first open case, n = 10, the construction using an $R_2$-sequencing does not work, because neither the cyclic nor the dihedral group of any singly-even order is R-sequenceable [7] (nor, a fortiori, is it $R_2$-sequenceable). However, the existence of a set of permutations of the type postulated in Conjecture 2 is equivalent to the existence of a 2-fold perfect resolvable (10, 9, 1)-Mendelsohn design [8, p. 86] and the possible existence of a suitable "irregular" (non-cyclically generated) design is not ruled out. Another quite different construction of a set of h, but not of h + 1, pairwise orthogonal idempotent Latin squares from a set of permutations of the above kind, or from any resolvable h-fold perfect (n, h + 1, 1)-Mendelsohn design, is given in [1].


REFERENCES 


1. F. E. Bennett, E. Mendelsohn and N. S. Mendelsohn, Resolvable perfect cyclic designs, J. Combin. Theory Ser. A 29 (1980) 142-150.

2. R. C. Bose, S. S. Shrikhande, and E. T. Parker, Further results on the construction of mutually orthogonal Latin squares and the falsity of Euler's conjecture, Canad. J. Math. 12 (1960) 189-203.

3. J. Dénes and A. D. Keedwell, Latin Squares and their Applications, Academic Press, New York; Akadémiai Kiadó, Budapest; English Universities Press, London, 1974.

4. J. Dénes, G. L. Mullen, and S. J. Suchower, A note on power sets of Latin squares, J. Combin. Math. Combin. Comput. 16 (1994) 27-31.

5. A. D. Keedwell, On orthogonal Latin squares and a class of neofields, Rend. Mat. e Appl. (5) 25 (1966) 519-561.




6. A. D. Keedwell, Concerning the existence of triples of pairwise almost orthogonal 10 × 10 Latin squares, Ars Combin. 9 (1980) 3-10.

7. A. D. Keedwell, On R-sequenceability and $R_h$-sequenceability of groups, Combinatorics '81 (Rome 1981) 535-548, North-Holland Math. Stud. 78, North-Holland, Amsterdam-New York, 1983.

8. A. D. Keedwell, Circuit designs and Latin squares, Ars Combin. 17 (1984) 79-90.


Csaba u.10 
H-1122 Budapest 
Hungary 


PROBLEMS AND SOLUTIONS


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 


with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, M. J. Pelling, Richard Pfiefer, Leonard Smiley, John Henry 
Steelman, Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY problems address on the inside front cover. Submitted problems should include [...] before November 30, 1997. Additional information, such as generalizations and references, is welcome. The problem number and the solver's name and address should appear on each solution. An acknowledgement will be sent only if a mailing label is provided. An asterisk (*) after the number of a problem or a part of a problem indicates that no solution is currently available.


PROBLEMS 


The problems in the May issue should have been numbered from 10592 through 10598 

but were inadvertently numbered from 10585 through 10591, duplicating the numbers from 
the April issue. We ask our readers to use the intended numbers 10592-10598 for solutions, 
reference, and indexing for problems in the May issue. 
10599. Proposed by Fred Galvin, University of Kansas, Lawrence, KS. Let $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$ be nonnegative numbers and let $(a_{ij})$ be an m × n matrix of nonnegative numbers with at least one nonzero entry in each row. Suppose that the inequality $\sum_{h=1}^{m} a_{hj} x_h \le \sum_{k=1}^{n} a_{ik} y_k$ holds whenever $a_{ij} > 0$. Show that $\sum_{i=1}^{m} x_i \le \sum_{j=1}^{n} y_j$.


10600. Proposed by Franz Rothe, University of North Carolina, Charlotte, NC. 

(a) Suppose a triangle has its vertices at integer lattice points in the plane and contains 
exactly 3 integer lattice points in its interior. Show the center of mass of the triangle is not 
an integer lattice point. 

(b)* Find all values of i such that, if a triangle has its vertices at integer lattice points in the 
plane and contains exactly i integer lattice points in its interior, then the center of mass of 
the triangle cannot be an integer lattice point. 


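10601. Proposed by Wen-Xiu Ma, Universität-GH Paderborn, Paderborn, Germany. Let n ≥ 1 be an integer and let $a_1, a_2, \ldots, a_n$ be complex numbers. Show that

$$\begin{vmatrix}
1 & a_1 & a_1^2 & \cdots & a_1^{2n-1} \\
1 & a_2 & a_2^2 & \cdots & a_2^{2n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & a_n & a_n^2 & \cdots & a_n^{2n-1} \\
0 & 1 & 2a_1 & \cdots & (2n-1)a_1^{2n-2} \\
0 & 1 & 2a_2 & \cdots & (2n-1)a_2^{2n-2} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 1 & 2a_n & \cdots & (2n-1)a_n^{2n-2}
\end{vmatrix}
= (-1)^{n(n-1)/2} \prod_{1 \le i < j \le n} (a_i - a_j)^4.$$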
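
For readers who want to experiment with the identity in Problem 10601 before attempting a proof, the sketch below evaluates both sides exactly for a few small integer inputs. It is only a spot check, not a solution: the function names are invented here, the determinant is computed by plain Gaussian elimination over `Fraction`, and only integer values of the a_i are tried, although the problem allows complex numbers.

```python
from fractions import Fraction
from itertools import combinations


def determinant(matrix):
    """Exact determinant by Gaussian elimination with row swaps."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result


def confluent_matrix(a):
    """Rows (1, a_i, ..., a_i^(2n-1)) followed by their derivative rows."""
    size = 2 * len(a)
    value_rows = [[x ** j for j in range(size)] for x in a]
    deriv_rows = [[j * x ** (j - 1) if j else 0 for j in range(size)] for x in a]
    return value_rows + deriv_rows


def claimed_value(a):
    """Right-hand side (-1)^(n(n-1)/2) * prod_{i<j} (a_i - a_j)^4."""
    n = len(a)
    prod = 1
    for i, j in combinations(range(n), 2):
        prod *= (a[i] - a[j]) ** 4
    return (-1) ** (n * (n - 1) // 2) * prod


for sample in ([5], [1, 2], [0, 1, 3], [2, -1, 4, 7]):
    print(sample, determinant(confluent_matrix(sample)), claimed_value(sample))
```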
10602. Proposed by Dan Sachelarie, ICCE Bucharest, Romania, and Vlad Sachelarie, The Ohio State University, Columbus, OH. In a triangle ABC, let H be the orthocenter, I the incenter, O the circumcenter, N the nine-point center, r the inradius, and R the circumradius. Prove that $\angle HIO \ge \pi/2 + \arcsin\sqrt{2r/R}$, with equality if and only if $\angle HIN = \pi/2$.


10603. Proposed by Yury J. Ionin and Robin R. Lewis, Central Michigan University, Mt. Pleasant, MI. Let a, b, and k be positive integers and let $P_k(a, b)$ be the period of the sequence $\{a^n \bmod b^k\}_{n=1}^{\infty}$. Find $\lim_{k \to \infty} P_{k+1}(a, b)/P_k(a, b)$.
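
To get a feel for the quantity in Problem 10603, the periods can be computed by brute force. The sketch below is exploratory only and does not answer the problem: the function name `period_of_powers` is invented here, and "period" is taken to mean the eventual period, so that sequences with a pre-periodic part (a not coprime to b) are also handled.

```python
def period_of_powers(a, modulus):
    """Eventual period of the sequence a^n mod modulus, n = 1, 2, ..."""
    seen = {}                       # residue -> first index at which it occurred
    value, n = a % modulus, 1
    while value not in seen:
        seen[value] = n
        value = (value * a) % modulus
        n += 1
    # Each term determines the next, so the first repeat gives the period.
    return n - seen[value]


a, b = 2, 3                         # sample values chosen for illustration
periods = [period_of_powers(a, b ** k) for k in range(1, 7)]
ratios = [periods[k + 1] // periods[k] for k in range(len(periods) - 1)]
print("periods:", periods)
print("successive ratios:", ratios)
```

Trying several pairs (a, b), including ones with a common factor, shows how the successive ratios behave for growing k.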


10604. Proposed by Joseph Rosenblatt, University of Illinois, Urbana, IL. 
(a) Determine positive constants c and C such that if 0 < a < b then 


sin(ax) sin(bx) a 
(5) up ge pe SOC) 


(*) 


*>0 


(b)* What are the largest constant c and the smallest constant C such that (*) holds whenever 0 < a < b?


10605. Proposed by Jonathan M. Borwein and C. G. Pinner, Simon Fraser University, Burnaby, BC, Canada. Let r and m be positive integers and define

$$P_r(m) = \prod_{\substack{n = 1 \\ n \ne m}}^{\infty} \frac{n^r - m^r}{n^r + m^r}.$$


(a) Show that P;(m) = 0 and that 


Py(m) = (—1y"*1 ( mT ares 


(b) Show that P2(m) = (-1)"+!2m / sinh(szm) and that, more generally, P2;(m) is given 
by 


. (-1)/ 
(— jy 2m 7 (sinh mx)" (cosh (22m sin (= )) — cos (20m cos (=))) 


j=l 
where € = (1 + (—1)*)/2. 


SOLUTIONS 


Equilateral Cevian Triangles 


10358* [1994, 76]. Proposed by Jiang Huanxin, student, Fudan University, Shanghai, China. In triangle △ABC, find all points P such that the triangle △DEF (with D = AP ∩ BC, E = BP ∩ CA, F = CP ∩ AB) is equilateral.


Solution by David Goering, Eastern Washington University, Cheney, WA. We restrict ourselves to the case where the points D, E, F belong to the line segments BC, AC, AB, respectively, in which case we say △DEF is inscribed in △ABC.

The strategy is to characterize all equilateral triangles △DEF that can be inscribed in △ABC, and then give a condition under which AD, BE, and CF are concurrent. Once this is done, we find P and prove its uniqueness.

All triangles up to similarity may be represented by the coordinates A(0, 0), B(1, 0), C(b, c), with c > 0 and 0 ≤ b ≤ 1/2. Let F be a point on AB with coordinates $F(x_0, 0)$. Then E on AC has coordinates

$$\left( \frac{x_0 b \tan\phi}{b\tan\phi - c},\ \frac{x_0 c \tan\phi}{b\tan\phi - c} \right),$$
where tan φ is the slope of EF. If we now let M be the midpoint of EF and let s be the length of EF, we find the coordinates of D with the vector addition

$$D = M + \frac{\sqrt{3}}{2}\, s\, (\sin\phi, -\cos\phi).$$

This makes △DEF equilateral. We now have

$$D = \left( \frac{x_0\big((2b + \sqrt{3}c)\tan\phi - c\big)}{2(b\tan\phi - c)},\ \frac{c\,x_0(\tan\phi - \sqrt{3})}{2(b\tan\phi - c)} \right).$$

Since we wish $D = (D_x, D_y)$ to lie on BC, for fixed φ, we find $x_0$ so that $D_y \cdot (b - 1) = c \cdot (D_x - 1)$. Solving this equation for $x_0$ and substituting this value into the coordinates of D, E, and F gives coordinates for the equilateral triangle DEF. Letting $d = \sqrt{3} - \sqrt{3}b + c - (1 + b + \sqrt{3}c)\tan\phi$, we have

$$D = \left( \frac{-(2b + \sqrt{3}c)\tan\phi + c}{d},\ \frac{-c(\tan\phi - \sqrt{3})}{d} \right), \quad
E = \left( \frac{-2b\tan\phi}{d},\ \frac{-2c\tan\phi}{d} \right), \quad
F = \left( \frac{2(c - b\tan\phi)}{d},\ 0 \right). \tag{$*$}$$

The case φ = π/2 is easily found by taking limits. Note that while every inscribed equilateral triangle is given by (*) for some value of φ, there are values of φ for which △DEF is not inscribed. That is, △DEF is an equilateral triangle whose vertices are on the extended lines AB, AC, and BC. Also, △DEF now has an orientation such that point D is to the right as FE is traversed from F to E. Finally, we note the geometric impossibility of an equilateral triangle with 0 ≤ tan φ ≤ √3, since this forces segment FD to lie on or below AB.

We now express the equations of AD, BE, and CF in standard form and note that these lines are concurrent if and only if det M = 0, where M is the matrix of coefficients from the equations of the three lines. For the sake of brevity, we use the slope-intercept form, but the algebraic condition requires writing the slope of each of these lines in the form (x + y tan φ)/(w + z tan φ) and clearing denominators. Denoting the slope of a line PQ by $m_{PQ}$, we have

$$M = \begin{pmatrix} m_{AD} & -1 & 0 \\ m_{BE} & -1 & -m_{BE} \\ m_{CF} & -1 & c - b\,m_{CF} \end{pmatrix}.$$


After simplification, the equation det M = 0 reduces to

$$(-2b + 2b^2 - \sqrt{3}c + \sqrt{3}bc - 3c^2)\tan^3\phi + (5c - 13bc - \sqrt{3}c^2)\tan^2\phi + (6b - 6b^2 - \sqrt{3}c + \sqrt{3}bc + 5c^2)\tan\phi + (-3c + 3bc - \sqrt{3}c^2) = 0.$$

Let x = tan φ and let $p(x) = k_3 x^3 + k_2 x^2 + k_1 x + k_0$ denote the above cubic in tan φ.

We now show that, for all admissible values of b and c, p(x) has exactly one root that yields an inscribed △DEF. Then P can be found as the intersection of any two of the lines AD, BE, and CF, and lies within △ABC.

We begin by observing that $k_3 = 2b(b - 1) + \sqrt{3}c(b - 1) - 3c^2 < 0$, and $p(0) = 3c(b - 1) - \sqrt{3}c^2 < 0$. This assures a negative real root.

Now consider the behavior of p(x) at x = −√3. We have $p(-\sqrt{3}) = 24c(1 - 2b) \ge 0$, which requires that there be at least one real root in [−√3, 0). If △ABC is isosceles (b = 1/2), then p(−√3) = 0. When tan φ = −√3, $F_x = 1/2$ and $m_{ED} = 0$, as we would expect. If b < 1/2, p(−√3) > 0. We also have $p'(-\sqrt{3}) = -16c^2 + 4\sqrt{3}c(9b - 5) + 12b(b - 1) < 0$, and $p''(-\sqrt{3}) = -12\sqrt{3}b^2 - 44bc + 16\sqrt{3}c^2 + 12\sqrt{3}b + 28c = 12\sqrt{3}b(1 - b) + 4c(7 - 11b) + 16\sqrt{3}c^2 > 0$, for c > 0 and 0 ≤ b ≤ 1/2. Since p(x) is nonnegative, decreasing, and concave up at x = −√3, p(x) has no roots in (−∞, −√3). Thus all negative roots of p are in [−√3, 0).

In a similar manner, it can be shown that, for x = √3, p(x) is negative, decreasing and concave down. Thus, p(x) has no roots in [√3, ∞). Since △DEF cannot be inscribed for 0 ≤ tan φ ≤ √3, p(x) has no roots in [0, ∞) that yield an inscribed triangle.

To show that p(x) cannot have more than one root in [−√3, 0), we suppose that it does and argue by contradiction. Since p(0) < 0, p(x) must have a local maximum in (−√3, 0), and it must be the case that p'(0) < 0 and p''(0) < 0. Since $p'(0) = k_1$ and $p''(0) = 2k_2$, we consider the graphs of $k_1 < 0$ and $2k_2 < 0$. The former region is bounded by a hyperbola in b and c; the latter by two intersecting lines. From their graphs, it is clear that p'(0) and p''(0) are never simultaneously negative in [0, 1/2) × (0, ∞), the geometrically relevant region. Thus, p(x) has exactly one root in [−√3, 0).

We now show that the equilateral triangle determined by setting tan φ equal to the negative root of p(x) is inscribed in △ABC, i.e., that D, E, and F defined by (*) belong to the respective sides of △ABC, rather than simply to their extensions. First, note that −√3 ≤ tan φ < 0 implies that the coordinates of D, E, and F are all nonnegative ($d = \sqrt{3}(1 - b) + c - (1 + b + \sqrt{3}c)\tan\phi > 0$). It can be shown algebraically that $E_x < b$, which requires point E to lie on the open segment AC. It can also be shown algebraically that $F_x > 1$ implies that $D_x > b$. The remaining possibilities are 1) D and E belong to △ABC while $F_x > 1$, and 2) E and F belong to △ABC while $D_x < b$.

Suppose that $F_x > 1$. Then the point AD ∩ BE lies in the interior of △ABC, but CF ∩ BE lies outside △ABC, so AD, BE, and CF are not concurrent. But this contradicts the fact that tan φ is a solution to det M = 0. Thus, $F_x \le 1$. Similarly, if $D_x < b$, BE ∩ FC is inside △ABC, but BE ∩ AD is not, so again the three lines are not concurrent, which is a contradiction. This establishes that points D, E, and F lie on the open segments BC, AC, and AB, respectively, when tan φ is given by the negative root of p(x).

We have thus established that p(x) has exactly one root in [−√3, 0) and that this root determines an equilateral triangle inscribed in △ABC. Furthermore, the lines AD, BE, and CF are concurrent and meet inside △ABC at point P. For 0 ≤ b ≤ 1/2 and c > 0, we have

$$P_x = \frac{2\tan\phi\,\big((2b + \sqrt{3}c)\tan\phi - c\big)}{3 - 3b + \sqrt{3}c + 2(-\sqrt{3} + \sqrt{3}b - 3c)\tan\phi + (1 + 3b + 3\sqrt{3}c)\tan^2\phi}$$

and

$$P_y = P_x\,\frac{c(\tan\phi - \sqrt{3})}{(2b + \sqrt{3}c)\tan\phi - c},$$

where tan φ is the unique negative real root of p(x).
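
The whole recipe can be run end to end for a sample triangle. The sketch below is a numerical illustration only, assuming the sample values b = 0.3 and c = 0.8 and using plain bisection (a method chosen here, not one named in the solution) to locate the root of p in [−√3, 0). It then evaluates the formulas for P and checks that P lies on both BE and CF. All function names are invented for this example.

```python
import math

R3 = math.sqrt(3.0)


def cubic(x, b, c):
    """p(x) = k3 x^3 + k2 x^2 + k1 x + k0 with the coefficients given above."""
    k3 = -2 * b + 2 * b * b - R3 * c + R3 * b * c - 3 * c * c
    k2 = 5 * c - 13 * b * c - R3 * c * c
    k1 = 6 * b - 6 * b * b - R3 * c + R3 * b * c + 5 * c * c
    k0 = -3 * c + 3 * b * c - R3 * c * c
    return ((k3 * x + k2) * x + k1) * x + k0


def negative_root(b, c, iterations=200):
    """Bisection on [-sqrt(3), 0], relying on p(-sqrt(3)) >= 0 > p(0)."""
    lo, hi = -R3, 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cubic(mid, b, c) >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cevian_point(b, c):
    """Return tan(phi) and the point P from the closed-form expressions."""
    t = negative_root(b, c)
    num = (2 * b + R3 * c) * t - c
    den = (3 - 3 * b + R3 * c
           + 2 * (-R3 + R3 * b - 3 * c) * t
           + (1 + 3 * b + 3 * R3 * c) * t * t)
    px = 2 * t * num / den
    py = px * c * (t - R3) / num
    return t, (px, py)


b, c = 0.3, 0.8                       # sample triangle, chosen arbitrarily
t, P = cevian_point(b, c)

# Vertices E and F from (*), needed to test concurrency.
d = R3 - R3 * b + c - (1 + b + R3 * c) * t
E = (-2 * b * t / d, -2 * c * t / d)
F = (2 * (c - b * t) / d, 0.0)

# Cross products vanish when P is collinear with C, F and with B, E.
on_CF = (P[0] - b) * (F[1] - c) - (P[1] - c) * (F[0] - b)
on_BE = (P[0] - 1.0) * E[1] - P[1] * (E[0] - 1.0)

print("tan(phi) =", t)
print("P =", P)
print("residuals on CF and BE:", on_CF, on_BE)
```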


Editorial comment. Two partial solutions were received. Charles H. Jones established the existence of a suitable choice of P as follows. For each D in the segment BC, choose P in the segment AD and determine E and F as in the statement of the problem. There is a unique choice of P for which ∠EDF = π/3. The curve of such points P(D) goes from B to C as D goes from B to C, and |DF| − |DE| changes sign on this curve. By the intermediate value theorem, there exists a choice with |DF| = |DE|. He also studied the existence of solutions when △DEF is not required to be inscribed in the sense of the selected solution. If the point P is on any of the extended sides of △ABC, or on the line parallel to one of the sides through the opposite vertex, then △DEF is never equilateral. Removing these six lines leaves 16 connected regions that can be studied by analytic means. Some conditions for the existence of solutions in each of these regions were obtained.
Helen M. Marston considered only inscribed equilateral triangles, and suggested using Ceva's theorem to arrive at an equation characterizing the point P.


A Recurrence Whose Solution Changes Sign Infinitely Often 


10396 [1994, 681]. Proposed by Ioan Tomescu, University of Bucharest, Bucharest, Romania. Let α > 0 and let $(b_n : n \ge 1)$ be defined recursively by $b_1 = \alpha$, $b_2 = 3\alpha$, and

$$b_{n+1} = (2n + 1)b_n - (n^2 + \alpha^2)b_{n-1} \qquad (n \ge 2).$$

Prove that $(b_n)$ contains infinitely many positive and infinitely many negative terms.


Solution I by Richard Stong, Rice University, Houston, TX. We show that $b_n$ is the imaginary part of $c_n$, where $c_n = (1 + i\alpha)(2 + i\alpha) \cdots (n + i\alpha)$. First note that $c_n$ satisfies the given recurrence. To see this, write

$$\frac{c_{n+1}}{c_{n-1}} = (n + 1 + i\alpha)(n + i\alpha) = (2n + 1)(n + i\alpha) + (-n + i\alpha)(n + i\alpha) = (2n + 1)\frac{c_n}{c_{n-1}} - (n^2 + \alpha^2).$$

Since the recurrence is linear with real coefficients, the imaginary part of $c_n$ is also a solution. One then easily checks that it has the desired initial values.

The required property of $b_n$ is now clear. The argument of $c_n$ is $\sum_{k=1}^{n} \arctan(\alpha/k)$. This sum diverges like the harmonic series since all terms are between 0 and π/2 and, for large k, the summand is asymptotic to α/k. Therefore, $c_n$ must be in each quadrant of the complex plane infinitely often and, in particular, $b_n$ must take on both signs infinitely often.
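
The identification of b_n with the imaginary part of c_n can be checked with exact integer arithmetic. The sketch below is an illustration only: it fixes the assumed sample value α = 1 so that every quantity is an integer, generates the first 60 terms both from the recurrence and from the product, and reports where the sign changes occur. The function names are invented here.

```python
def recurrence_terms(alpha, count):
    """b_1, ..., b_count from b_{n+1} = (2n+1) b_n - (n^2 + alpha^2) b_{n-1}."""
    b = [None, alpha, 3 * alpha]          # index 0 unused so b[n] is b_n
    for n in range(2, count):
        b.append((2 * n + 1) * b[n] - (n * n + alpha * alpha) * b[n - 1])
    return b[1:count + 1]


def imaginary_parts(alpha, count):
    """Im c_n for c_n = (1 + i alpha)(2 + i alpha)...(n + i alpha)."""
    re, im = 1, 0                         # empty product
    out = []
    for k in range(1, count + 1):
        # (re + i im)(k + i alpha)
        re, im = re * k - im * alpha, re * alpha + im * k
        out.append(im)
    return out


alpha, count = 1, 60                      # sample values for illustration
b = recurrence_terms(alpha, count)
im_c = imaginary_parts(alpha, count)

sign_changes = [n + 1 for n in range(1, count) if b[n] * b[n - 1] < 0]
print("agree:", b == im_c)
print("first terms:", b[:5])
print("indices n where b_n and b_(n-1) differ in sign:", sign_changes)
```

Because the argument of c_n grows only logarithmically, sign changes are sparse; extending `count` shows later ones.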


Solution II by Richard Holzsager, American University, Washington, DC. Let $q_n = b_n/b_{n-1}$. The recursion is then $q_{n+1} = 2n + 1 - (n^2 + \alpha^2)/q_n$. If $q_n < 0$, then $b_n$ and $b_{n-1}$ have opposite signs; while if $q_n$ is 0 or ∞, then $b_k = 0$ for k = n or n − 1, so the original recurrence shows that $b_{k+1}$ and $b_{k-1}$ have opposite signs. Thus it suffices to show that there are arbitrarily large n for which $q_n$ is not a positive real number. It is convenient to write $d_n = q_n - n$. Then

$$d_{n+1} = n - \frac{n^2 + \alpha^2}{n + d_n} = \frac{n d_n - \alpha^2}{n + d_n} = d_n - \frac{\alpha^2 + d_n^2}{n + d_n}, \tag{$*$}$$

and we need to show that assuming $d_n > -n$ for all n ≥ N leads to a contradiction. With this hypothesis, (*) implies that $d_N > d_{N+1} > d_{N+2} > \cdots$. But then, for n ≥ N, $(\alpha^2 + d_n^2)/(n + d_n) > \alpha^2/(n + d_N)$. Since $\sum \alpha^2/(n + d_N)$ is divergent, it follows that $d_n$ diverges to −∞.

For $0 > d_n > -n$, $|d_{n+1}| \ge |d_n|(1 + |d_n|/n)$, so

$$\frac{|d_{n+1}|}{n + 1} \ge \frac{|d_n|}{n} \cdot \frac{1 + |d_n|/n}{1 + 1/n}.$$

Since $|d_n|$ is eventually greater than 2, one has for all large n

$$\frac{1 + |d_n|/n}{1 + 1/n} > \frac{1 + 2/n}{1 + 1/n}.$$

These form a divergent product, so the $d_n/n$ must also diverge to −∞, contradicting the assumption that $d_n > -n$.


Solution III by Donald A. Darling, Newport Beach, CA. Define a sequence $\{a_n\}$, n = 1, 2, ..., by setting $b_n = n!\,a_n$. Let $f(x) = \sum_{n=1}^{\infty} a_n x^n$ be the generating function for the sequence $\{a_n\}$. We show that

$$f(x) = \frac{1}{1 - x} \sin\left( \alpha \log \frac{1}{1 - x} \right), \qquad |x| < 1. \tag{$*$}$$

This function has infinitely many zeros in the interval 0 ≤ x < 1, namely at the points $x_j = 1 - \exp(-2j\pi/\alpha)$, j = 0, 1, 2, ..., so the extension of Descartes' rule of signs to power series shows that $a_n$, and hence $b_n$, has infinitely many changes in sign. This extension of Descartes' rule of signs may be found in G. Pólya and G. Szegő, Problems and Theorems in Analysis, Vol. II, Springer-Verlag, 1972, Part 5, Chapter 1.

To prove (*), we first set $b_n = n!\,a_n$ in the difference equation for $\{b_n\}$, and after some routine simplification, we obtain

$$a_{n+1} - 2a_n + a_{n-1} = -\frac{1}{n + 1}(a_n - a_{n-1}) - \frac{\alpha^2}{n(n + 1)}\,a_{n-1}, \qquad (n \ge 2) \tag{$**$}$$

with $a_1 = \alpha$ and $a_2 = 3\alpha/2$. Multiplying the recurrence (**) by $n(n + 1)x^{n-1}$, summing for n ≥ 2, and using the initial values for $a_1$ and $a_2$ yields the differential equation

$$(1 - x)^2 f''(x) - 3(1 - x) f'(x) + (1 + \alpha^2) f(x) = 0$$

through straightforward calculation. This is a classical Euler differential equation in the variable (1 − x), which can be transformed into an equation with constant coefficients. The solution with the initial conditions f(0) = 0, $f'(0) = a_1 = \alpha$, is given by (*).


Editorial comment. Several solvers observed that all nonzero solutions of the recurrence have the required property. In particular, Leandro Cagliero obtained this result using the method of Solution II. He also remarked that the solution of the recurrence with α = 0, $b_1 = A$, and $b_2 = B$ is $b_n = \big( (3A - B) + (B - 2A) \sum_{j=1}^{n} 1/j \big)\, n!$, which has constant sign for all sufficiently large n.

The solution of Th. B. van Dulken was based on a comparison theorem for difference 
equations analogous to those for ordinary differential equations. This theorem allowed 
him to obtain some general results on the oscillatory behavior of solutions of difference 
equations. 

Solved also by J. Alvarez (Spain), J. Anglesio (France), D. Borwein (Canada), L. Cagliero (Argentina), Th. B. van Dulken (Australia), P. G. Kirmser, J. H. Lindsey II, S. C. Locke, L. E. Mattics, I. Nemes (Austria), A. Nijenhuis, V. Novakov (Bulgaria), M. Vowe (Switzerland), A. N. 't Woord (The Netherlands), WMC Problems Group, and the proposer.


Asymptotic Behavior of a Nonexpansive Sequence 


10404 [1994, 792]. Proposed by Behzad Djafari Rouhani, Shahid Beheshti University and Islamic Azad University, Tehran, Iran. Let $x_1, x_2, \ldots$ be a sequence of real numbers such that $|x_i - x_j| \ge |x_{i+1} - x_{j+1}|$ for all positive integers i, j with |i − j| ≤ 2. Prove that $(x_n/n)$ converges to a finite limit as n → ∞.


Solution by Allen Stenger, Irvine, CA. The sequences $\{|x_n - x_{n+1}|\}$ and $\{|x_n - x_{n+2}|\}$ are nonincreasing and bounded below by zero, so they must approach limits. If either limit is zero, i.e., $x_n - x_{n+1} = o(1)$ or $x_n - x_{n+2} = o(1)$, then by telescoping summation we infer that $x_n = o(n)$, hence $\{x_n/n\}$ approaches zero.

Therefore assume neither limit is zero. We prove the stronger statement that $\{x_n - x_{n+1}\}$ approaches a finite limit, from which we again infer by summation that $\{x_n/n\}$ approaches a limit.

Denote the limit of $\{|x_n - x_{n+1}|\}$ by L (where L > 0), and write $x_n - x_{n+1} = L s_n + o(1)$, where $s_n = \pm 1$. Then

$$|x_n - x_{n+2}| = |(x_n - x_{n+1}) + (x_{n+1} - x_{n+2})| = L\,|s_n + s_{n+1}| + o(1).$$

Since this approaches a nonzero limit, we infer that $s_n$ must have a constant sign after some point and is therefore a constant s. Thus, $x_n - x_{n+1} = Ls + o(1)$ as claimed.


Editorial comment. The proposer noted that the problem arose from a study of the theorem 
of A. Pazy, Asymptotic behavior of contractions in Hilbert space, Israel J. Math. 9 (1971), 
235-240. 
Solved also by R. Barbara (Lebanon), D. Borwein (Canada), P. Budney, R. J. Chapman (U. K.), D. Cui, J.-P. Grivaux (France), R. Holzsager, G. Keselman & R. R. Goldberg, J. H. Lindsey II, O. P. Lossers (The Netherlands), H. Morris, A. Nijenhuis, C. G. Petalas & T. P. Vidalis (Greece), C. Popescu (Belgium), F. Richman, K. Schilling, M. Shemesh (Israel), R. Stong, A. A. Tarabay (Lebanon), NSA Problems Group, Prague Problem Solution Group (Czech Republic), USA Problems Group, WMC Problems Group, and the proposer.


Brianchon, Desargues, Pascal 


10405 [1994, 793]. Proposed by Herbert Gülicher, Westfälische Wilhelms-Universität, Münster, Germany. Let $A_1A_2A_3A_4A_5A_6$ be a hexagon circumscribed about a conic, and form the intersections $P_i = A_iA_{i+2} \cap A_{i+1}A_{i+3}$ (i = 1, ..., 6, all indices mod 6). Show that the $P_i$ are the vertices of a hexagon inscribed in a conic.


Solution by Albert Nijenhuis, University of Pennsylvania (Emeritus), Philadelphia, PA, and University of Washington, Seattle, WA. Since $A_1A_2A_3A_4A_5A_6$ is circumscribed about a conic, by Brianchon's Theorem the lines $A_iA_{i+3}$ (i = 1, 2, 3) are concurrent. Let O be the common point.

By Desargues's Theorem, since the triangles $A_iA_{i+2}A_{i+4}$ (i = 1, 2) are perspective (with respect to O), the intersections of the pairs of corresponding sides $A_3A_5 \cap A_6A_2$, $A_1A_5 \cap A_4A_2$, and $A_1A_3 \cap A_4A_6$ are collinear.

By (the converse of) Pascal's Theorem, since the pairs $(P_1P_2, P_4P_5)$, $(P_2P_3, P_5P_6)$, and $(P_3P_4, P_6P_1)$ intersect in collinear points, the points $P_1, \ldots, P_6$ lie on a conic.


Editorial comment. R. H. Jeurissen remarked that each of the three classical theorems used here gives an equivalence between two conditions, so that the argument can be reversed to show that $A_1A_2A_3A_4A_5A_6$ is circumscribed about a conic if $P_1, \ldots, P_6$ lie on a conic. In fact, this converse is the projective dual of the present problem, since the construction of sides of $A_1A_2A_3A_4A_5A_6$ from the sides of $P_1P_2P_3P_4P_5P_6$ is dual to the given construction. Even the proof is self-dual since Desargues's Theorem is self-dual and the other two are duals of each other. The proposal was accompanied by a proof by Günther Pickert using properties of the Steiner points of an inscribed hexagon in place of Desargues's Theorem. Further details about these classical theorems of projective geometry can be found in H. S. M. Coxeter, Introduction to Geometry, second edition, Wiley, 1969, Heinrich Dörrie, 100 Great Problems of Elementary Mathematics, Dover, 1965, and D. Pedoe, A Course of Geometry, Cambridge, 1970.


Solved also by J. Anglesio (France), M. Benedicty, R. H. Jeurissen (The Netherlands), O. P. Lossers (The Netherlands), 
G. Pickert, C. G. Petalas & T. P. Vidalis (Greece), C. Popescu (Belgium), M. Reid, R. Tauraso (Italy), and the proposer. 


Grid Paths Split Area in Half 


10406 [1994, 793]. Proposed by David C. Fisher, University of Colorado, Denver, CO, 
Karen L. Collins, Wesleyan University, Middleton, CT, and Lucia B. Krompart, Rochester, 
MI. Suppose a path on an m-by-n square grid starts at the northwest corner, goes through 
each point exactly once, and ends at the southeast corner. Show that such a path divides the 
grid into two equal halves: (a) those regions opening north or east, and (b) those regions 
opening south or west. 


Solution I by Roger B. Eggleton, Illinois State University, Normal, IL, and R. J. Simpson, Curtin University of Technology, Perth, Western Australia. Embed the m × n grid in an m × n array of unit squares centered at the grid points. Extend each end of the curve by a half-unit horizontal segment outward to the edge of the array. (See the figure at the top of the next page.) Including the squares adds congruent L-shaped areas to the two regions described, so it suffices to show that the augmented regions A and B have equal area.

In traversing the curve, region A is to the left and region B is to the right. When the curve turns right in a cell, it adds 3/4 to the area of A and 1/4 to the area of B. When it turns left, the contributions are 1/4 and 3/4, respectively. Cells with no turn are split in half. Since the curve does not intersect itself and has parallel initial and final segments, the number of left turns equals the number of right turns. Thus the areas are equal.


Solution II by Michael Reid, Brown University, Providence, RI. By Pick’s Theorem, the area 
of a polygon with vertices at lattice points depends only on the number i of interior lattice 
points and the number b of boundary lattice points. (It equals i + b/2 — 1, but we don’t 
need this.) 

Complete the path between (1, n) and (m, 1) to a polygon by adding segments from the 
ends to (0, 0). Alternatively, add segments from the ends to (m + 1,n + 1). The polygons 
augment the desired regions by congruent figures, so it suffices to show that they have the 
same area. Each has mn + 1 boundary points and no interior points, so by Pick’s Theorem 
the areas are equal. 


Editorial comment. Most solvers used Pick's Theorem, which is discussed in Grünbaum and Shephard, Pick's Theorem, this MONTHLY 100 (1993), 150-161. Stephen H. Schanuel observed that the argument using Pick's Theorem applies in the more general problem where the segments of the path are not constrained to the grid lines.

The desired result also follows from Grinberg's Theorem in graph theory. Adding a curve from (1, n) to (m, 1) outside the grid completes a spanning cycle in a planar graph with mn vertices. The unbounded face and the long face inside the cycle have length m + n − 1. The remaining faces are 4-cycles. Grinberg's Theorem states that (length(F) − 2) has the same sum over the faces F inside the cycle as it does over the faces outside, and therefore there are as many squares inside as outside.


Solved also by J. C. Barthelmes, D. Beckwith, M. Benedicty, S. Brandt (Germany), L. Cagliero & J. Lauret (Argentina), 
R. J. Chapman (U. K.), R. Ehrenborg (Canada), J. W. Grossman, R. Holzsager, R. H. Jeurissen & B. Polman (The 
Netherlands), U. Klein (Germany), N. Komanda, J. Kupka (Australia) & M. Leinert (Germany), K. M. Levasseur, 
J. H. Lindsey II, O. P. Lossers (The Netherlands), D. Marcus, R. F McCoart, A. Nijenhuis, C. Popescu (Belgium), 
R. C. Read (Canada), R. M. Robinson, S. H. Schanuel, R. Stong, D. Wolfe, Anchorage Problem Solutions Group, NSA 
Problems Group, Prague Problem Solution Group (Czech Republic), WMC Problems Group, and the proposers. 


The Effect of Truncation 


10419 [1994, 1014]. Proposed by Bill Correll, Jr, Denison University, Granville, OH. Let 
k be an integer greater than or equal to 3. Let S(k) be the set of nonnegative real numbers 
x for which 


e+ bee] |e] 


(a) Determine the largest integer in S(k). 
(b) Show that S(k) is the union of a finite number of intervals with the sum of the lengths of those intervals equal to $(k^2 - 3k + 6)/2$.


Composite solution by O. P. Lossers, Technical University Eindhoven, Eindhoven, The 
Netherlands, and National Security Agency Problems Group, Fort Meade, MD. The largest 
integer in S(k) is k(k — 2). Since the equation holds for x if and only if it holds for |x |, it 
suffices to determine the integers in S(k). 

The desired equation is AD + |x/k] = BC + |x/k —1], where A = |(x +k —2)/k], 
B= ((x+k—1)/k],C = l(wa+k—2)/(K—1)], and D = [(x+k—-1)/(k—-1)]. 
When x is an integer, we have A = B —e€ andC = D — é€’, where is 1 if x = 1 modk 
and 0 otherwise, and é’ is 1 if x = 0 mod (k — 1) and 0 otherwise. 

When A = B andC = D, we require |x/k]| = |x/(k — 1)]. Letting x = a(k — 1) +B, 
the equation holds when 0 < a < b < k—1. We must exclude b € {0,a +1}. There 
remain ar b = (k — 1)(k — 2)/2 solutions, of which x = (k — 2)k is the largest. 

When A = B — 1 andC = D, we require |x/k| = |x/(K -—1)]| + D> |x/(k—-1)]. 
This inequality fails for x > 0 and k > 3, so there are no solutions in this case. 

When A = B andC = D — 1, we require |x/k| + L(x +k —1)/k| = |x/(kK-1)]. 
Also x must be divisible by k — 1 and not congruent to 1 modulo k. This holds only when 
xisOork—1. 

When A = B—1landC = D—1, werequire |x/k|]+ L(x +k—1)/k] = |x/(kK-1)|+ 
L(x +k —1)/(k —1)]. Also x = 0 mod (k — 1) and x = 1 mod k. Since x is a multiple of 
k —1, we have (x +k—1)/(kK-—1)] > L(a+k-—-1)/k], and always |x/(k —1)] > |x/k]. 
Thus equality cannot hold in this case. 

Altogether, we obtain $(k-1)(k-2)/2 + 2 = (k^2 - 3k + 6)/2$ unit-length half-open 
intervals where the condition holds, with the largest integer being $k(k-2)$. 
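
The count and the largest integer can be checked by brute force for small k. The sketch below is not part of the published solution. It assumes the equation in the form used in the solution, AD + floor(x/k) = BC + floor(x/(k-1)), and the names `in_S` and `integers_in_S`, as well as the search bound of 5k^2, are illustrative choices.

```python
def in_S(x, k):
    """Test A*D + floor(x/k) == B*C + floor(x/(k-1)) for an integer x >= 0."""
    A = (x + k - 2) // k
    B = (x + k - 1) // k
    C = (x + k - 2) // (k - 1)
    D = (x + k - 1) // (k - 1)
    return A * D + x // k == B * C + x // (k - 1)


def integers_in_S(k, bound_factor=5):
    """List the integers 0 <= x <= bound_factor * k^2 that satisfy the equation.

    The bound is an assumption made for the search; the solution shows that
    no integer larger than k(k-2) can occur.
    """
    return [x for x in range(bound_factor * k * k + 1) if in_S(x, k)]


if __name__ == "__main__":
    for k in range(3, 9):
        xs = integers_in_S(k)
        print(k, len(xs), max(xs), xs)
```

Each integer x found this way contributes the half-open interval [x, x+1) to S(k), so the number of integers found is also the total length of S(k).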

Solved also by J. C. Binz (Switzerland), R. J. Chapman (U. K.), J. Christopher, J. H. Lindsey II, D. K. Nester, A. A. Tarabay 
(Lebanon), D. B. Tyler, A. N. ’t Woord (Netherlands), WMC Problems Group, and the proposer. 


Sums of Two Squares and a Cube 


10426 [1995, 70]. Proposed by Noam Elkies, Harvard University, Cambridge, MA, and 
Irving Kaplansky, Mathematical Sciences Research Institute, Berkeley, CA. Show that any 
integer can be expressed as a sum of two squares and a cube. Note that the integer being 
represented and the cube are both allowed to be negative. 


Solution by Andrew Adler, University of British Columbia, Victoria, British Columbia, 
Canada. 


\begin{align*}
2x + 1 &= (x^3 - 3x^2 + x)^2 + (x^2 - x - 1)^2 - (x^2 - 2x)^3 \\
4x + 2 &= (2x^3 - 2x^2 - x)^2 + (2x^3 - 4x^2 - x + 1)^2 - (2x^2 - 2x - 1)^3 \\
8x + 4 &= (x^3 + x + 2)^2 + (x^2 - 2x - 1)^2 - (x^2 + 1)^3 \\
16x + 8 &= (2x^3 - 8x^2 + 4x + 2)^2 + (2x^3 - 4x^2 - 2)^2 - (2x^2 - 4x)^3 \\
16x &= (x^3 + 7x - 2)^2 + (x^2 + 2x + 11)^2 - (x^2 + 5)^3
\end{align*}
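
The five identities cover every integer according to its residue modulo 2, 4, 8, and 16, so they translate directly into a procedure. The following sketch is an illustration rather than part of the published solution; the function name `represent` and the return convention (a, b, c) with a^2 + b^2 + c^3 = n are assumptions made here.

```python
def represent(n):
    """Return integers (a, b, c) with a*a + b*b + c**3 == n.

    The case split follows the five polynomial identities: each one writes
    its left-hand side as (square) + (square) - (cube), so c is the negative
    of the cubed polynomial.
    """
    if n % 2 == 1:                      # n = 2x + 1
        x = (n - 1) // 2
        a, b, d = x**3 - 3*x**2 + x, x**2 - x - 1, x**2 - 2*x
    elif n % 4 == 2:                    # n = 4x + 2
        x = (n - 2) // 4
        a, b, d = 2*x**3 - 2*x**2 - x, 2*x**3 - 4*x**2 - x + 1, 2*x**2 - 2*x - 1
    elif n % 8 == 4:                    # n = 8x + 4
        x = (n - 4) // 8
        a, b, d = x**3 + x + 2, x**2 - 2*x - 1, x**2 + 1
    elif n % 16 == 8:                   # n = 16x + 8
        x = (n - 8) // 16
        a, b, d = 2*x**3 - 8*x**2 + 4*x + 2, 2*x**3 - 4*x**2 - 2, 2*x**2 - 4*x
    else:                               # n = 16x
        x = n // 16
        a, b, d = x**3 + 7*x - 2, x**2 + 2*x + 11, x**2 + 5
    return a, b, -d


if __name__ == "__main__":
    for n in (-7, 0, 6, 12, 24, 1997):
        a, b, c = represent(n)
        print(n, (a, b, c), a*a + b*b + c**3 == n)
```

Python's `%` returns a nonnegative remainder for a positive modulus, so the same case split handles negative n without extra work.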


Editorial comment. Other identities were supplied by readers, but all solutions used a similar 
division into cases. John P. Robertson notes that the representation for odd integers follows 
from Theorem 2 on page 113 of L. J. Mordell, Diophantine Equations, Academic Press, 
1966. He and the proposers mention related open problems, including sums of a square and 
two cubes. 


Solved also by J. P. Robertson, A. N. ’t Woord (The Netherlands), and the proposers. 


A Polynomial Identity 


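10466 [1995, 654]. Proposed by E. Sparre Andersen & Mogens Esrom Larsen, University of 
Copenhagen, Copenhagen, Denmark. For $x \in \mathbb{C}$ and $n \in \mathbb{N}$, prove the following identities 
between polynomials. 


\[ (-4)^n \sum_{j=0}^{n} \binom{x+1/2}{j} \binom{n-1-x}{2n-j} = \binom{2n}{n} \sum_{j=0}^{n} \binom{x+j}{2j} \binom{x-j}{2n-2j} \tag{a} \]


For all $m \in \mathbb{N}$ with $0 \le m \le 2n$, generalize (a) to 


\[ (-4)^n \sum_{j=0}^{n} \binom{x+1/2}{j} \binom{n-1-x}{2n-j} = \binom{2n}{n} \sum_{j=-\lfloor m/2 \rfloor}^{n-\lceil m/2 \rceil} \binom{x+j}{m+2j} \binom{x-j}{2n-m-2j} \tag{b} \]
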
Solution by Robin J. Chapman, University of Exeter, Exeter, UK. Since (b) reduces to (a) 
when $m = 0$, it suffices to prove (b). Both sides are polynomials in $x$ of degree at most $2n$, 
so it suffices to verify that they are equal for at least $2n + 1$ distinct values of $x$. We use the 
$n$ integers and the $n + 1$ half-integers in the interval $[-1/2, n - 1/2]$. We show that both 
sides equal 0 at the integers and equal $4^{-n} \binom{2n}{n}$ at the half-integers in this range. Let 


\[ f(x) = \sum_{j=0}^{n} \binom{x+1/2}{j} \binom{n-1-x}{2n-j}, \qquad g_m(x) = \sum_{j=-\lfloor m/2 \rfloor}^{n-\lceil m/2 \rceil} \binom{x+j}{m+2j} \binom{x-j}{2n-m-2j}. \]


First consider $f$. When $0 \le x \le n-1$ and $0 \le j \le n$, we have $2n - j > n - 1 - x$. When 
$x$ is an integer in this range, we thus have $\binom{n-1-x}{2n-j} = 0$ and $f(x) = 0$. When $x = r - 1/2$ 
with $r$ a nonnegative integer, the sum for $f(x)$ reduces to $\sum_{j=0}^{r} \binom{r}{j} \binom{n-r-1/2}{2n-j}$. Since the 
Vandermonde convolution $\binom{y}{2n} = \sum_{j=0}^{r} \binom{r}{j} \binom{y-r}{2n-j}$ is valid for real $y$ and nonnegative integer 
$r$, we have $f(r - 1/2) = \binom{n-1/2}{2n} = (-16)^{-n} \binom{2n}{n}$, as desired. 

We next show that $g_m(x) = 0$ when $x$ is an integer in $[0, n-1]$. If $g_m(x) \ne 0$, then 
some summand $\binom{x+j}{m+2j} \binom{x-j}{2n-m-2j}$ must be nonzero. Both factors must be nonzero; the first 
requires a) $x + j < 0$ or b) $x + j \ge m + 2j$, and the second requires c) $x - j < 0$ or 
d) $x - j \ge 2n - m - 2j$. Each choice produces a contradiction: (a&c) $\Rightarrow 2x < 0$; 
(b&d) $\Rightarrow 2x \ge 2n$; (a&d) $\Rightarrow 2n - m < 0$; (b&c) $\Rightarrow m < 0$. Thus each term is 0. 

It remains to evaluate $g_m(x)$ at half-integer values. We first subtract 0 in the form 


\[ h_m(x) = \sum_{j=-\lfloor (m+1)/2 \rfloor}^{n-\lceil (m+1)/2 \rceil} \binom{x+j+1/2}{m+2j+1} \binom{x-j-1/2}{2n-m-2j-1}. \]


The summand for $h_m(x)$ is that for $g_m(x)$ with $j + 1/2$ substituted for $j$. Since $x + j + 1/2$ 
is now integral, the same case analysis with $j + 1/2$ in place of $j$ shows that every term in 
the sum is 0 except possibly when $x + j + 1/2 < 0$ and $x - j - 1/2 < 0$. This requires 
$x = -1/2$, $j < 0$, and $j > -1$, which again is impossible. 

We now let $k_m(x) = g_m(x) - h_m(x)$. By setting $s = m + 2j$ in the summand for $g_m(x)$ 
and $s = m + 2(j + 1/2)$ in the summand for $h_m(x)$ and observing that $m - s$ is even for 
terms in the first sum and odd for terms in the second, we obtain 


\[ k_m(x) = (-1)^m \sum_{s=0}^{2n} (-1)^s \binom{x - \frac{m-s}{2}}{s} \binom{x + \frac{m-s}{2}}{2n-s}. \]


We claim that $k_m(x) = 4^{-n}$ for all $x$. Rearranging the factors in computing binomial 
coefficients yields 


\[ \binom{x - \frac{m-s}{2}}{s} \binom{x + \frac{m-s}{2}}{2n-s} = \frac{m!\,(2n-m)!}{s!\,(2n-s)!} \binom{x + \frac{m-s}{2}}{m} \binom{x - \frac{m-s}{2}}{2n-m}. \]


Define $\phi(z) = \binom{(z+m)/2}{m}$ and $\psi(z) = \binom{(z-m)/2}{2n-m}$. With $y = 2x$, we obtain 


\[ k_m(x) = (-1)^m \binom{2n}{m}^{-1} \sum_{s=0}^{2n} (-1)^s \binom{2n}{s} \phi(y - s)\, \psi(y + s). \]


From the calculus of finite differences, it follows that the sum $\sum_{s=0}^{2n} (-1)^s \binom{2n}{s} s^p$ is 0 
when $p$ is an integer with $0 \le p < 2n$ and is $(2n)!$ when $p = 2n$. For nonnegative integers 
$p, q$, the expression $(y - s)^p (y + s)^q$ is a homogeneous polynomial of degree $p + q$. Thus 
the sum 


\[ \sum_{s=0}^{2n} (-1)^s \binom{2n}{s} (y - s)^p (y + s)^q \]


is 0 when $p + q < 2n$ and equals $(-1)^p (2n)!$ when $p + q = 2n$. 

Since $\phi(z)$ and $\psi(z)$ are polynomials in $z$, we can express the summand in $k_m(x)$ as 
a linear combination of terms of the form $(y - s)^p (y + s)^q$. Since the leading terms of 
$\phi(z)$ and $\psi(z)$ are $z^m/(2^m m!)$ and $z^{2n-m}/(2^{2n-m} (2n - m)!)$, respectively, the only nonzero 
contribution is 


\[ k_m(x) = (-1)^m \binom{2n}{m}^{-1} \frac{(-1)^m (2n)!}{2^m m!\; 2^{2n-m} (2n - m)!} = 4^{-n}. \]
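
As an independent sanity check, identity (b) can be tested in exact rational arithmetic for small n. This sketch is not part of the published solution. It assumes that (b) reads (-4)^n f(x) = C(2n, n) g_m(x), with f and g_m as in the solution and j in g_m running from -floor(m/2) to n - ceil(m/2); the helper names `binom`, `lhs`, and `rhs` are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def binom(top, k):
    """Generalized binomial coefficient top(top-1)...(top-k+1)/k! for an integer k >= 0."""
    product = Fraction(1)
    for i in range(k):
        product *= Fraction(top) - i
    return product / factorial(k)


def lhs(x, n):
    """(-4)^n * sum_{j=0}^{n} C(x + 1/2, j) * C(n - 1 - x, 2n - j)."""
    total = sum(binom(x + Fraction(1, 2), j) * binom(n - 1 - x, 2 * n - j)
                for j in range(n + 1))
    return (-4) ** n * total


def rhs(x, n, m):
    """C(2n, n) * sum_j C(x + j, m + 2j) * C(x - j, 2n - m - 2j)."""
    lo = -(m // 2)            # -floor(m/2)
    hi = n - (m + 1) // 2     # n - ceil(m/2)
    total = sum(binom(x + j, m + 2 * j) * binom(x - j, 2 * n - m - 2 * j)
                for j in range(lo, hi + 1))
    return comb(2 * n, n) * total


if __name__ == "__main__":
    x = Fraction(7, 3)
    for n in range(1, 4):
        print(n, lhs(x, n), [rhs(x, n, m) for m in range(2 * n + 1)])
```

Because both sides are polynomials of degree at most 2n, agreement at any 2n + 1 distinct rational points already settles a given pair (n, m); the sample points are otherwise arbitrary.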


Solved also by S. B. Ekhad, C. Krattenthaler (Austria), Con Amore Problem Group (Denmark), and the proposers. 


Solutions Need Not Tend to Zero 


10498 [1996, 75]. Proposed by Ray Redheffer, University of California, Los Angeles, CA. 
Consider the system of differential equations 


\[ \frac{dx}{dt} = -(x + a(t)y), \qquad \frac{dy}{dt} = -(b(t)x + y) \tag{$*$} \]
where $a(t)$ and $b(t)$ are positive, continuous, and bounded for $0 \le t < \infty$. 

If $(\sup a(t))(\sup b(t)) < 1$, it is easy to prove that all solutions of $(*)$ tend to 0 as 
$t \to \infty$. Does the same conclusion follow if one assumes only that $\sup(a(t)b(t)) < 1$? 


Solution by the University of South Alabama Problem Group, Mobile, AL. We give an 
example to show that the answer is no. Take a(t) to be periodic with period 1 and define 


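\[ a(t) = \begin{cases} 36 & \text{if } 0 \le t \le 1/4; \\ 36 - (1439/5)(t - 1/4) & \text{if } 1/4 \le t \le 3/8; \\ 1/40 & \text{if } 3/8 \le t \le 7/8; \\ 1/40 + (1439/5)(t - 7/8) & \text{if } 7/8 \le t \le 1. \end{cases} \]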


Now set $b(t) = a(t - 1/2)$ and compute $\sup(a(t)b(t)) = 9/10$. Take $x(0) = 1$ and 
$y(0) = -1$ for the initial conditions and set $u(t) = e^t x(t)$ and $v(t) = -e^t y(t)$. Then the 
initial value problem becomes: 


\[ \frac{du}{dt} = a(t)v, \qquad \frac{dv}{dt} = b(t)u, \qquad u(0) = 1, \quad v(0) = 1. \]


Clearly $u(t)$ and $v(t)$ are increasing functions. Using the mean value theorem we have 
$u(n + 1/4) - u(n) \ge 36 v(n) \cdot 1/4$ and $v(n + 3/4) - v(n + 1/2) \ge 36 u(n + 1/4) \cdot 1/4$ 
for $n = 0, 1, 2, \ldots$. Hence $u(n + 1) - u(n) \ge 9 v(n)$ and $v(n + 1) - v(n) \ge 9 u(n)$ for 
$n = 0, 1, 2, \ldots$, so we get $u(n) \ge 10^n$ and $v(n) \ge 10^n$ by induction. Hence $x(n) \ge e^{-n} 10^n$, 
$y(n) \le -e^{-n} 10^n$, and we have our example. 
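
The growth claimed for u and v can be observed numerically. The sketch below is an illustration, not part of the published solution: it integrates u' = a(t)v, v' = b(t)u with a classical fourth-order Runge-Kutta scheme. The step count, the grid used to estimate sup(ab), and the function names `integrate` and `sup_ab` are assumptions made here.

```python
def a(t):
    """The 1-periodic coefficient a(t) from the example."""
    s = t % 1.0
    if s <= 0.25:
        return 36.0
    if s <= 0.375:
        return 36.0 - (1439.0 / 5.0) * (s - 0.25)
    if s <= 0.875:
        return 1.0 / 40.0
    return 1.0 / 40.0 + (1439.0 / 5.0) * (s - 0.875)


def b(t):
    """b(t) = a(t - 1/2)."""
    return a(t - 0.5)


def sup_ab(grid=1000):
    """Maximum of a(t) * b(t) over a uniform grid on one period."""
    return max(a(i / grid) * b(i / grid) for i in range(grid))


def integrate(periods, steps_per_period=2000):
    """Return [(u(0), v(0)), (u(1), v(1)), ...] for u' = a v, v' = b u, u(0) = v(0) = 1."""
    h = 1.0 / steps_per_period
    u, v = 1.0, 1.0
    samples = [(u, v)]
    for i in range(periods * steps_per_period):
        t = i * h
        k1u, k1v = a(t) * v, b(t) * u
        k2u, k2v = a(t + h / 2) * (v + h / 2 * k1v), b(t + h / 2) * (u + h / 2 * k1u)
        k3u, k3v = a(t + h / 2) * (v + h / 2 * k2v), b(t + h / 2) * (u + h / 2 * k2u)
        k4u, k4v = a(t + h) * (v + h * k3v), b(t + h) * (u + h * k3u)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        if (i + 1) % steps_per_period == 0:
            samples.append((u, v))
    return samples


if __name__ == "__main__":
    print("sup(ab) on grid:", sup_ab())
    for n, (u, v) in enumerate(integrate(3)):
        print(n, u, v, u >= 10 ** n and v >= 10 ** n)
```

Since x(n) = e^{-n} u(n) and 10/e > 1, the computed values of u(n) translate into a solution x that grows rather than decays.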


Solved also by P. Alsholm (Denmark), J. C. Bronski, R. Kelsey, J. H. Lindsey II, L. Scribani (South Africa), and the 
proposer. 


Edited by Underwood Dudley 
Mathematics Department, De Pauw University, Greencastle, IN 46135 


The Life of Stefan Banach. By Roman Kaluza. Translated and edited by Ann Kostant 
and Wojbor Woyczynski. Birkhäuser, Boston, 1996, x + 137, $24.50. 


Reviewed by Sheldon Axler 


In at least one printing of the current (fifteenth) edition of the Encyclopedia 
Britannica, the entry on Stefan Banach did not contain the words “Poland” or 
“Polish”. The Britannica called Banach a “Soviet mathematician.” The encyclope- 
dia fixed its error in later printings, but the mathematics community has not yet 
adequately documented Banach’s life and ideas. A computer search of Mathemati- 
cal Reviews reveals more than eleven thousand publications with the word “Banach” 
in the title; “Hilbert” occurs in only seven thousand titles. Yet no mathematician 
or historian of mathematics has produced a book-length biography of Stefan 
Banach. 

The book under review was written neither by a mathematician nor by a 
historian. The author, a Polish reporter and journalist, writes well about mathe- 
matics without using any mathematical symbols. Professional mathematicians will 
spot a few technical errors of the type that inevitably creep into exposition at this 
level. For example, we read that “the only linear transformations” on a finite- 
dimensional Euclidean space are “translations, rotations, and reflections.” Such 
small mistakes in mathematical details can easily be forgiven because the author 
does a good job of capturing the flavor of early functional analysis and its creators. 

The book suffers more from the lack of a historian’s perspective than from an 
absence of mathematical expertise. Some events described in the book cry out for 
more explanation. For example, consider the author’s description of the Nazi 
efforts to eliminate the intelligentsia in occupied Poland during World War II. 
Before capturing the Polish university town of Lvov, where Banach lived and 
worked, German officials compiled a list of prominent professors, scientists, and 
writers in Lvov who would be executed. One night shortly after German soldiers 
had entered Lvov, SS units murdered forty leading intellectual figures in Lvov 
without even the pretense of trials. But Banach was untouched by the Nazi death 
squads. An alert reader will wonder why Banach, who at this time was President of 
the Polish Mathematical Society and a Dean at the university, was not among the 
intellectuals marked down for liquidation. Unfortunately the author does not 
comment on the apparent disparity between his description of Nazi plans to crush 
Polish intellectual life and the survival of Banach, Poland’s most influential 
mathematician. Was Banach spared because he had too much fame? Or were the 
occupying forces so mathematically illiterate that they had never heard of Banach? 
The author does not even speculate about these questions that beg to be answered. 

As another example of a tantalizing tidbit from the book that needs more 
explanation, consider the following account (p. 51) of Banach’s support for the 
mathematical logician Leon Chwistek: 


when at some point Chwistek applied for a position in logic in Lvov, Banach 
backed him unequivocally and helped him to obtain the post. The affair 
scandalized half of intellectual Poland since Chwistek, in addition to being a 
respected scholar, also had a well-deserved reputation as being a somewhat 
strange and very eccentric artist. 


Banach himself was “somewhat strange” and “eccentric”; that description surely 
fits many mathematicians. So why would Banach’s support for such a person have 
“scandalized half of intellectual Poland”? Readers will realize that something 
more must have been involved here, but the author provides no hints to help solve 
this mystery. 

In 1928 Stefan Banach and his colleague Hugo Steinhaus founded Studia 
Mathematica, which quickly became the most important journal specializing in the 
then new field of functional analysis. Today’s mathematics librarians, grappling 
with budget problems, will be amused to learn that the first volume of Studia 
Mathematica cost $1.50 outside Poland. 

When teaching the graduate course in functional analysis, I always use the 
Krein-Milman Theorem and its appearance in Studia Mathematica as an excuse to 
inject a bit of history into the classroom. The Krein-Milman Theorem states that in 
a locally convex topological vector space, every compact convex set is the closed 
convex hull of its extreme points. This result was published (in somewhat less 
generality than the version just stated) in the 1940 volume of Studia Mathematica, 
which also contained two papers written by Banach. That volume of the journal 
was printed on poor-quality paper, clearly due to wartime conditions. The most 
curious feature of the 1940 volume is that each article (they are all written in either 
English, French, or German) appears with an abstract in Russian. Obviously Lvov, 
where Studia Mathematica was published, lay in the Soviet zone of occupation at 
the time of publication. Two weeks after Germany had invaded Poland from the 
west in September 1939, the Soviet Union marched into Poland from the east. 
Poland was partitioned between Germany and the Soviet Union until the summer 
of 1941, when Germany attacked the Soviet Union and occupied all of Poland. 

The 1940 volume of Studia Mathematica was the last one edited by Banach, who 
died at age 53 shortly after World War II ended in 1945. After an absence of eight 
years, Studia Mathematica resumed publication in 1948 in Wroclaw. Poland’s 
border had moved westward after World War II, so that Lvov was then in the 
Soviet Union (no doubt this accounts for the Britannica’s claim that Banach was a 
“Soviet mathematician”). A few years ago Lvov again changed countries—it is now 
part of Ukraine. Today Studia Mathematica, still a fine journal specializing in 
functional analysis, is published in Warsaw. The cover of each issue still proudly 
bears the names of the founding editors, Banach and Steinhaus. 

In 1932 Banach published his famous book Théorie des Opérations Linéaires, 
based on his Polish version published a year earlier. Remarkably, Théorie des 
Opérations Linéaires remains in print today more than six decades after its original 
publication, partly because of its historic value as the first monograph on func- 
tional analysis but also because of the clean, modern style with which Banach 
presents the fundamentals of the subject (as created in good part by him and his 
collaborators). While a graduate student, I read Théorie des Opérations Linéaires to 
study for my French exam. I remember the thrill of seeing functional analysis 
developed by a legendary hero of twentieth century mathematics and my delight in 
his extraordinarily clear writing. I also remember my amusement that what we 
today call “Banach spaces” are called “spaces of type (B)” in Banach’s book. From 
the book under review I learned that Banach had previously written several 
popular high school mathematics textbooks for use throughout Poland; perhaps 
writing for a high school audience had honed Banach’s excellent expository skills. 

The Life of Stefan Banach left me hungry for more information about this 
fascinating figure. However, the author has performed a valuable service by 
uncovering some previously unknown data about Banach and by interviewing many 
of the dwindling number of people who knew Banach. This sketchy biography is a 
good place to start for someone wanting to learn about Banach. 


Department of Mathematics 
Michigan State University 
East Lansing, MI 48824 
axler@math.msu.edu 


101 Careers in Mathematics. Edited by Andrew Sterrett. Mathematical Association of 
America, 1996, 250, $20. 


Reviewed by J. Kevin Colligan 


101 Careers in Mathematics is going to be ammunition for both sides of a lot of 
discussions. Andy Sterrett did a marvelous job of collecting an impressive variety of 
stories from people with degrees in mathematics and related disciplines. These 
people are now working in a stunning range of occupations. They include an 
astronaut, several college professors, marketing consultants, teachers at the ele- 
mentary and secondary levels, mathematicians in government service (I have to put 
in a plug for my own “brand” of mathematician!), engineers, computer scientists, 
actuaries, a health professional, a Deputy Assistant Secretary of Defense, and even 
the Director of Inventory Control for L. L. Bean. Mathematics majors go every- 
where and do everything. 

On the whole, at least based on the 101-person sample in this book, they’re 
fairly successful, too. And every one of them attributes his or her success and 
happiness to their experience in their undergraduate partial differential equations 
course! 

If you believe that last statement, you need to get outside for a stronger dose of 
reality. They are in fact fairly successful and happy, but it wasn’t the PDE course 
that did it, at least not exclusively. Some recurring themes in their comments stood 
out for me; I’ll point them out to get you thinking about these items, and perhaps 
to start a discussion. Here’s what I heard. 


(0) There seemed to be a reasonable balance across gender and degree level. 

(1) A lot of these people did, and continue to do, a lot of computer-related 
work. 

(2) Many say the “mathematical way of thinking” or the crystallization of their 
thinking skills was the best gift they received from their education. 

(3) Some—but not as many as I would have expected or hoped—said commu- 
nication skills and people skills count. 

(4) Several referred to the current need to perform in a teaming environment, 
especially cross-disciplinary ones. 

(5) There was one reference to an expansion of individual and collective 
abilities: I have a few thoughts on this that I’ll put off until the end. 


Let’s take these in turn. 

(0) Gender and degree balance. I didn’t go through the book building a 2-by-3 
contingency table to compare degree level (roughly 3) against gender (roughly 2). 
There was a time when I might have done this, but my pile of reading material is 
too tall right now; if you decide to do it, let me know what comes out. It’s balanced 
by gender, and I presume I'd see about the breakdown by degree level that recent 
statistics indicate: about half of the bachelor’s degrees go to women, maybe 40% of 
the master’s degrees, and about 25% of the doctorates. The fact that those 
graduate percentages are increasing so slowly continues to bother me. I’ve heard 
all the reasoned explanations, rationalizations, excuses, and finger-pointing. We 
have to put all that aside and solve the problem. No, I don’t know the cure-all 
here, either, but some places are being successful: let’s clone what works in those 
programs. 

This book showcases a lot of successful women; so does She Does Math!, 
another excellent book on this topic, also published by the Association. Copies of 
these books should be in every post-secondary institution in the country. Maybe 
every high school, too. I can hear you now telling me to get outside for a 
stronger dose of reality. Well, it’s important to dream, isn’t it? Let’s use all the 
talent we have in this country. 

(1) Computer-related work. Computers fly our planes, control our cars, run (or 
at least facilitate) our economy, run (no qualifier needed) our infrastructure, 
amuse us, teach us, and drive us nuts. Computers and knowing how to use them 
are integral parts of most if not all of the careers described in this book. We’ve 
made tremendous strides in using and applying this technology to all our jobs. 
We’re still struggling to integrate them, to find the correct niche for technology. 

At one end we have questions such as, should all knowledge be marketable? 
The results of the human genome project? Phone number databases? Purchase 
profile databases? Where does one draw the line? Our technological development 
has clearly outstripped our moral development. To make proper and prudent use 
of the technology we already have, we need to refine our understanding of the 
individual and collective ethical framework for our actions. 

At the other—or at least nearer—end, how should we bring technology (prim- 
arily but not exclusively computers) into mathematics instruction? Students are 
going to use (will have to use!) these tools when they get a job. How are we 
fulfilling our ethical and professional duty to prepare them for their personal and 
professional lives? I’ve heard the argument that one loses proficiency by relying on 
a “computer crutch.” In the October, 1996 issue of NCTM’s Mathematics Teacher, 
Ken Ross said: 


Since it is easier to measure and spot deficiencies in skills than in under- 
standing, this decline can easily be overemphasized. This problem is serious, 
however, especially since our future scientists, engineers, and mathematicians 
must obtain both substantial understanding and substantial skills. The reform 
movements need to address this issue. 


Whether one is skilled at or challenged by mathematics, I’d much prefer having 
someone come away with an appreciation for how to approach a problem, and 
what resources can and ought be brought to bear to solve it. Frederick L. Frostic, a 
Deputy Assistant Secretary of Defense, said, “Those who know how things work 
always seem to work things best.” Teach the skills; teach the technology: they’ll 
need ’em both. The people in this book do. 

(2) The mathematical way of thinking. Most of the 101 Careers people said that 
their most valuable and utile acquisition as a mathematics major was, how to think. 
Sometimes we make mathematics look like a duck. We show the finished product, 
the calm and beauty above the waterline, without showing all the paddling 
underneath that gets us where we’re headed. If we want to develop that mathemat- 
ical way of thinking—what I believe Wade Ellis calls the HOTS (Higher Order 
Thinking Skills)—we’d better show some of the scaffolding along with the finished 
product. I have been told that George Mackiw, a friend of mine from Loyola 
College in Maryland, occasionally stops in class after completing a proof and says, 
“Now I want you to step back for a moment and appreciate the beauty of that 
proof.” I like that. It encourages one to look at the edifice at different granulari- 
ties. That’s a core competency for a mathematician. 

This way of thinking, the ability to both abstract and refine as needed, is one of 
the most valued gifts that instructors can give to all mathematics students. Teach 
them to analyze, to generalize, to look for patterns, but most of all, teach them to 
think. Ron Bosquet, R&D Product Manager for Hewlett-Packard, said, “My 
mathematics training... taught me how to learn” and he is only one of many who 
say essentially the same thing. 

(3) Communication and people skills. Some people in the book singled this out 
as being important. Some did, but not as many as I expected. Well, at least no one 
said it was unimportant. I would have put this first in my list: it’s that important to 
me and to the environment I live in, and I speculate that it’s implicitly part of 
everyone’s environment. 

Some of the 101 said that these skills enable them to communicate between 
technical and non-technical constituencies, and I say, hooray for them! I’ve seen 
and worked with people at both ends of this spectrum. The ones that succeed are 
those who can communicate well orally and in writing (at least one person said 
graphically, too), who can explain technical reasoning to a non-technical audience, 
who understand that priorities—and hence decisions—can be determined by views 
and feelings and not always by facts. These individuals are the ones whose voices 
matter, the ones that can be relied on to cut to the quick of complicated issues. 
They’re the ones who make a difference. It’s more fun to make a difference. 

So how does this bear on what happens in the classroom? Well, let’s think not 
only about how we personally give stellar examples of crystal-clear communication 
and interpersonal skills, but also about how we encourage this behavior in 
students. Are there group activities? Oral presentations? Writing assignments? Do 
we encourage and reward clarity and grammar? How about asking for an “execu- 
tive summary” of a proof or argument? 

Of course, all of these skills are interconnected. As professional mathemati- 
cians, we owe it to our colleagues, students, customers, or whomever to build and 
use these skills ourselves, and aid and abet their development in others. 

(4) Teaming. This is another apple pie and motherhood issue. In my opinion, 
the days of Herculean solitary efforts are gone. (Okay, I have to think about how 
to finesse Andrew Wiles’s work into this mold.) Has anyone done a count of the 
average number of authors per paper submitted to journals lately? Has anyone 
studied this over time? Problems—at least most of the ones that the 101 Careers 
people talked about—are multidisciplinary. 

I like working on teams. I generally learn more mathematics, people thank me 
for providing the expertise I bring to the team, and I often get introduced to a 
whole new class of problems that I didn’t even know about. Done right, teams 
create synergy that invigorates all team members. 

It’s not an innate skill that everyone’s born with, though. The “parallel play” 
that is often seen in very young children sometimes translates into “parallel work” 
in adults, even in a supposedly team-oriented environment. It seems to be 
something we can be taught to do, however: there are quite a few entrepreneurs 
out there offering $800-a-pop courses that claim to develop teaming skills. Mathe- 
matics courses should develop teaming skills as well. Investigate, question, experi- 
ment, even guess: in a team that has developed mutual trust, these activities can 
accelerate learning and show different perspectives on both solutions and stum- 
bling blocks. There have been many times when I thought I understood something, 
only to have someone else’s question make me realize there was more going on 
than I originally saw. More than once, such experiences opened my eyes to 
structure I didn’t even expect. 

Teaming is useful: we can help teach it. 

(5) Expansion of collective abilities. Here’s a quotation from Sherra Kerns, 
Chair of the Electrical Engineering and Computer Science Department at Vander- 
bilt University: “Computerization [is] a facilitation and expansion of our individual 
and collective capabilities, a revolution at the edge of global opportunities affect- 
ing each touched life.” I say, so is mathematics. [Soapbox alert!] It encourages us 
to develop global and local perspectives. It brings us the satisfaction of solving 
problems, of seeing patterns, often of simply fixing things, and it brings this to us 
both as individuals and in groups. In the words of Samson Cheung, a research 
scientist at NASA Ames, “Mathematics can provide one with the tools to use one’s 
imagination.” 

I’m going to raise a moral issue here again. The unreasonable applicability of 
mathematics to the real world has an ethical dimension. Issues of access to 
mathematical training are not moot: mathematical knowledge empowers people. 
Harlan Mills, an industrial consultant, said “I have been continually surprised at 
the level of mathematics education and maturity in successful colleagues in 
apparently nonmathematical positions or activities.” Likewise, mathematical igno- 
rance can disenfranchise them. Like it or not, we are gatekeepers. That brings with 
it obligations to discern carefully and thoughtfully our proper role, and act on it. 

All in all, I liked this book: it made me think about the breadth of my 
profession, and that forced me to think about its depth as well. The appendices, 
reprinted from Math Horizons, the MAA’s student magazine (which you should 
also order), give great suggestions for job applicants and interviewers alike. Get a 
copy of 101 Careers in Mathematics to see what comes out the far end of the 
mathematics education pipeline. It’s thought-provoking. 


J. Kevin Colligan 

National Security Agency 

Fort George G. Meade, MD 20755-6709 
jkev@romulus.ncsc.mil 




TELEGRAPHIC REVIEWS 


Edited by Arnold Ostebee 


with the assistance of the Mathematics Departments of 
Carleton, Macalester, and St. Olaf Colleges 


Telegraphic Reviews are designed to alert readers in a timely manner to new books 
appropriate to mathematics teaching and research. Special codes classify reviews by 
subject area and appropriate use: 


T : Textbook                  P : Professional Reading 
S : Supplementary Reading     L : Undergraduate Library 
C : Computer Software         13-18 : Grade Level 
1-4 : Semester                ** : Special Emphasis 
?? : Questionable 


Readers are advised that price information is subject to change. Selected books 
receive a second, more extensive review in the Monthly. 


Books submitted for review should be sent to Book Reviews Editor, American Mathematical 
Monthly, St. Olaf College, 1520 St. Olaf Avenue, Northfield, MN 55057-1098. 


General, P, L. Insights of Genius: Imagery 
and Creativity in Science and Art. Arthur I. 
Miller. Springer-Verlag, 1996, xxii + 482 pp, 
$27. [ISBN 0-387-94671-3] “To see is to un- 
derstand.” A thorough explanation of the role 
of visual images in prompting imagination and 
creativity, using 20th-century physics and art as 
case studies. Includes a fascinating discussion 
of the profile of creativity of Henri Poincaré 
prepared by his contemporary, the psychologist 
Edouard Toulouse. LAS 


Reference, P. Handbook of Brownian 
Motion—Facts and Formulae. Andrei N. 
Borodin, Paavo Salminen. Prob. & Its Applic. 
Birkhäuser Boston, 1996, xiv + 462 pp, $129. 
[ISBN 0-8176-5463-1] First part summarizes 
theory of linear diffusions. Second part is ta- 
ble of distributions of functionals of Brownian 
motion and related processes. 


Reference, P, L*. A Compendium on Nonlinear 
Ordinary Differential Equations. P.L. Sachdev. 
Wiley, 1997, xi + 918 pp, $125. [ISBN 0- 
471-53134-0] Information about closed form 
solutions, asymptotics, stability, existence, and 
numerical results. 


Reference, P. Learning LaTeX. David F. Griffiths, 
Desmond J. Higham. SIAM, 1997, x + 
84 pp, $15.50 (P). [ISBN 0-89871-383-8] A 
clear, simple, up-to-date, sometimes amusing 
introduction to LaTeX, brief yet surprisingly 
comprehensive. Covers basics, graphics, 
bibliographies, indexes, slides, electronic resources, 
differences between “old” and “new” LaTeX, etc. 
Rich with brief but pertinent examples. PZ 


Recreational Mathematics, P. Intriguing 
Mathematical Problems. Oswald Jacoby, 
William H. Benson. Dover, 1996, x + 191 pp, 
$6.95 (P). [ISBN 0-486-29261-4] Republi- 
cation of Mathematics for Pleasure (McGraw- 
Hill, 1962). 


Recreational Mathematics. Math and Logic 
Puzzles for PC Enthusiasts. J.J. Clessa. Dover, 
1996, x + 131 pp, $5.95 (P). [ISBN 0-486- 
29192-8] A slightly altered republication of 
Micropuzzles (Pan Books, 1983). 135 puzzles; 
some intended to be done using a computer. 


Education, P, L. Characterizing Pedagogical 
Flow: An Investigation of Mathematics and Sci- 
ence Teaching in Six Countries. William H. 
Schmidt, et al. Kluwer Academic, 1996, xiv 
+ 229 pp, $110. [ISBN 0-7923-4272-0] Re- 
sults from a pre-TIMSS investigation of typical 
classroom practice of mathematics teachers in 
grades 4 and 8. Chief conclusions: In France 
and Spain teachers organize and present for- 
mal, complex subject matter; in Norway and 
Switzerland classes are characterized by student 
exploration; in Japan the emphasis is on multi- 
ple approaches to carefully selected examples; 
and in the United States teachers present infor- 
mation and direct student activities. LAS 


Education, P. Bold Ventures, Volume 3: Case 
Studies of U.S. Innovations in Mathematics 
Education. Eds: Senta A. Raizen, Edward 
D. Britton. Kluwer Academic, 1996, xiii + 
376 pp, $140. [ISBN 0-7923-4233-X] Exten- 
sive analysis of three reform efforts in mathe- 
matics education: the development of standards 
by NCTM, the innovative modelling-based precalculus 
course developed by the North Carolina 
School of Science and Mathematics, and 
the Urban Mathematics Collaboratives estab- 
lished by the Ford Foundation. Based on inter- 
views with participants and observers, as well 
as visits to programs and classrooms. Intended 
as case studies in effecting change. LAS 


History, S(16-18), P. Mathematical Encoun- 
ters of the Second Kind. Philip J. Davis. 
Birkhäuser Boston, 1997, viii + 304 pp,
$24.95. [ISBN 0-8176-3939-X] Encounters 
of the “second kind” are with the people of 
mathematics (the “first kind” being reserved for 
mathematics itself). Four meandering reminis- 
cences, with some fiction mixed in, about em- 
peror Napoleon’s gift to mathematics, mathe- 
matician Stefan Bergman, and biochemist Lord 
Victor Rothschild. Written in Davis’s engaging 
personal style. LAS 


Foundations, T(18: 1), P. Vicious Circles: 
On the Mathematics of Non-Wellfounded Phe- 
nomena. Jon Barwise, Lawrence Moss. CSLI 
Lect. Notes, No. 60. Center for the Study 
of Language & Information (Leland Stanford 
Junior Univ, Stanford, CA 94305) & Cam- 
bridge Univ Pr, 1996, x + 390 pp, $24.95 (P); 
$49.95. [ISBN 1-57586-008-2; 1-57586-009-0]
An introduction to hypersets, a concept
that transcends sets. Explains recent results 
from mathematics, computer science, and phi- 
losophy. Includes applications to game theory, 
graph theory, semantical paradoxes, automata, 
and languages. DB 


Foundations, P. Kreiseliana: About and 
Around Georg Kreisel. Ed: Piergiorgio 
Odifreddi. AK Peters, 1996, xiii + 495 pp, 
$60. [ISBN 1-56881-061-X] Essays for 
Georg Kreisel’s 70th birthday: reminiscences, 
explanations of his work in logic and philoso- 
phy, technical papers in logic. DB 


Combinatorics, P. Embeddability in Graphs. 
Liu Yanpei. Math. & Its Applic. (China Ser.). 
Kluwer Academic, 1995, xvi + 398 pp, $198. 
[ISBN 0-7923-3648-8] 


Number Theory, P. Introduction to Cy- 
clotomic Fields, Second Edition. Lawrence 
C. Washington. Grad. Texts in Math., V. 83. 
Springer-Verlag, 1997, xiv + 487 pp, $59. 
[ISBN 0-387-94762-0] New edition of this 
classic on p-adic L-functions includes recent 
work of Thaine, Kolyvagin, and Rubin, as well 
as application of Jacobi sums to primality test- 
ing. (First Edition, TR, February 1983.) DB 


Group Theory, T(18: 1), P. 3-Transposition
Groups. Michael Aschbacher. Tracts in Math., 
V. 124. Cambridge Univ Pr, 1997, vii + 260 pp, 
$49.95. [ISBN 0-521-57196-0] Part I con- 
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tains the first published proof of Fischer’s clas- 
sification of almost simple groups generated 
by 3-transpositions. Parts II and III establish 
structural results on Fischer groups and pro- 
vide a foundation for the theory of sporadic 
groups. TH 


Group Theory, P. Symplectic Fibrations and 
Multiplicity Diagrams. Victor Guillemin, Eu- 
gene Lerman, Shlomo Sternberg. Cambridge 
Univ Pr, 1996, xiv + 222 pp, $49.95. [ISBN 
0-521-44323-7] 

Group Theory, P. Symmetric Inverse Semi- 
groups. Stephen Lipscomb. Math. Surveys & 
Mono., V. 46. AMS, 1996, xviii + 166 pp, $49. 
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gaswamy. Lect. Notes in Pure & Appl. Math., 
V. 182. Marcel Dekker, 1996, xii + 411 pp, 
$165 (P). [ISBN 0-8247-9789-2] Proceed- 
ings of a 1995 conference held in Colorado 
Springs, Colorado. 
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cations. Eds: Karl H. Hofmann, Michael W. 
Mislove. London Math. Soc. Lect. Note Ser., 
V. 231. Cambridge Univ Pr, 1996, ix + 165 pp, 
$29.95 (P). [ISBN 0-521-57669-5] Invited
survey papers from a 1994 conference at Tu- 
lane University. 

Algebra, P. Torsion Theories Over Commuta- 
tive Rings. Willy Brandal, Erol Barbut. BCS 
Associates, 1996, ix + 112 pp, $28 (P). [ISBN 
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nikov. Math. & Its Applic., V. 379. Kluwer 
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Algebra, P. Infinite-Dimensional Lie Groups. 
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tion. Dale Varberg, Edwin J. Purcell. Pren- 
tice Hall, 1997, xv + 975 pp. [ISBN 0-13- 
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technology-based problems and projects. 
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$19.45 (P). [ISBN 0-7872-2858-3] Projects 
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tions of several variables, and on parameterized 
surfaces. First three chapters introduce Math- 
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sample projects. Final chapter provides hints 
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of Real Functions, Fourth Edition. Ralph P. 
Boas. Revised: Harold P. Boas. Carus Math. 
Mono., No. 13. MAA, 1996, xiv + 305 pp, 
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[ISBN 0-486-69219-1] Republication in one 
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Partial Differential Equations, P. Elliptic 
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tributions. Yakov Roitberg. Math. & Its Ap- 
plic., V. 384. Kluwer Academic, 1996, xi + 
415 pp, $219. [ISBN 0-7923-4303-4] 


Partial Differential Equations, P. Numerical 
Approximation of Hyperbolic Systems of Con- 
servation Laws. Edwige Godlewski, Pierre- 
Arnaud Raviart. Appl. Math. Sci., V. 118. 
Springer-Verlag, 1996, viii + 509 pp, $59.95. 
[ISBN 0-387-94529-6] 


Dynamical Systems, T(16-18), P. Discrete
Hamiltonian Systems: Difference Equations, 
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Calvin D. Ahlbrandt, Allan C. Peterson. Texts 
in Math. Sci., V. 16. Kluwer Academic, 1996, 
xiv + 374 pp, $175. [ISBN 0-7923-4277-1] 
Thorough introduction includes discussion of 
the relationship of continued fractions to solu- 
tions of Riccati equations. Other topics include 
linear symplectic systems, discrete variational 
theory, symmetric three-term recurrences, and 
Bohner theory. MPR 


Dynamical Systems, P. Dynamic Systems on 
Measure Chains. V. Lakshmikantham, S. Siva- 
sundaram, B. Kaymakcalan. Math. & Its Ap- 
plic., V. 370. Kluwer Academic, 1996, x + 
285 pp, $144. [ISBN 0-7923-4116-3] 


Numerical Analysis, P, L. Numerical Recipes 
in Fortran 90: The Art of Parallel Scientific 
Computing, Second Edition. William H. Press,
et al. Cambridge Univ Pr, 1996, xx + 551 pp, 
$44.95. [ISBN 0-521-57439-0] Program list- 
ings, with hints on parallelization, for the For- 
tran 90 version of the Numerical Recipes li- 
brary. Also includes an introduction to For- 
tran 90 and parallel programming. AO 


Operator Theory, P. Monotone Operators in 
Banach Space and Nonlinear Partial Differen- 
tial Equations. R.E. Showalter. Math. Surv. & 
Mono., V. 49. AMS, 1997, xiii + 278 pp, $75. 
[ISBN 0-8218-0500-2] 


Operator Theory, P. The Asymptotic Distribu- 
tion of Eigenvalues of Partial Differential Op- 
erators. Yu. Safarov, D. Vassiliev. Transl. of 
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Equations in Abstract Spaces. Dajun Guo, V. 
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[ISBN 0-8218-0422-7] 

Analysis, T(17), P. An Introduction to the 
Mathematical Theory of Inverse Problems. An- 
dreas Kirsch. Appl. Math. Sci., V. 120. 
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tion, TR, August-September 1993; Extended 
Review, June-July 1994.) RM


Differential Geometry, P. Global Analysis in 
Mathematical Physics: Geometric and Stochas- 
tic Methods. Yuri Gliklikh. Transl: Viktor L. 
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Stochastic Processes, P. Ergodicity for In- 
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V. 229. Cambridge Univ Pr, 1996, xi + 339 pp, 
Stochastic Processes, T?(16-17: 1), P. Competitive
Markov Decision Processes. Jerzy Fi-
lar, Koos Vrieze. Springer-Verlag, 1997, xii + 
393 pp, $69. [ISBN 0-387-94805-8] Unified 
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decision processes (viewed as special cases of 
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Computer Science, P. Computer Facial Ani- 
mation. Frederic I. Parke. AK Peters, 1996, xv 
+ 365 pp, $59.95. [ISBN 1-56881-014-8] 


Computer Science, P, L. Applications on Ad- 
vanced Architecture Computers. Ed: Greg Ast- 
falk. SIAM, 1996, xvii + 359 pp, $35 (P). 
[ISBN 0-89871-368-4] Collection of articles 
originally published in SIAM News between 
March 1990 and June 1995. Many of the ar- 
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Computer Science, P. Problems of Reducing 
the Exhaustive Search. Eds: V. Kreinovich, G. 
Mints. AMS Transl., Ser. 2, V. 178. AMS, 
1997, x + 189 pp, $79. [ISBN 0-8218-0386- 
7] 12 papers on propositional satisfiability and 
related problems. 


Applications (Communication Theory), P. 
Signals & Systems, Second Edition. Alan V. 
Oppenheim, Alan S. Willsky, S. Hamid Nawab. 
Signal Proc. Ser. Prentice Hall, 1997, xxx + 
957 pp. [ISBN 0-13-814757-4] (First Edi- 
tion, TR, June-July 1983.) 


Applications (Economics), T(16: 1), P, L. Fi- 
nancial Calculus: An Introduction to Deriva- 
tive Pricing. Martin Baxter, Andrew Rennie. 
Cambridge Univ Pr, 1996, ix + 233 pp, $39.95. 
[ISBN 0-521-55289-3] Introduces basic ideas 
in discrete-time setting, then generalizes to con- 
tinuous time. Applications to actual financial 
instruments and the interest rate market. AO 

Applications (Engineering), C, P. Fourier- 
Related Transforms, Fast Algorithms and Ap- 
plications. Okan K. Ersoy. Prentice Hall, 1997, 
xix + 522 pp, with disk. [ISBN 0-13-624412-2] 


Applications (Engineering), P. Mathemati- 
cal and Numerical Modelling in Electrical En- 
gineering Theory and Applications. Michal 
Křížek, Pekka Neittaanmäki. Math. Mod.: The-
ory & Applic., V. 1. Kluwer Academic, 1996, 
xiii + 300 pp, $149. [ISBN 0-7923-4249-6] 


Applications (Physical Science), P. Mathe- 
matics of Microstructure Evolution. Eds: Long- 
Qing Chen, et al. SIAM, 1996, ix + 391 pp, 
$50 (P). [ISBN 0-89871-386-2] 31 papers 
from a 1995 symposium in Cleveland, Ohio. 
Applications (Physics), T(18: 2), P. Confor- 
mal Field Theory. Philippe Di Francesco, Pierre 
Mathieu, David Sénéchal. Grad. Texts in Con- 
temp. Physics. Springer-Verlag, 1997, xxi + 
890 pp, $89. [ISBN 0-387-94785-X] 


Applications (Physics), P. Geometry and 
Physics. Eds: Jørgen Ellegaard Andersen, et
al. Lect. Notes in Pure & Appl. Math., V. 184. 
Marcel Dekker, 1997, xxii + 745 pp, $185 (P). 
[ISBN 0-8247-9791-4] Invited papers from a 
series of four workshops, a summer school, and 
a conference held in 1995 in Denmark. 


Applications (Physics), S(15-17). Mathemat-
ics for Physicists. Philippe Dennery, André 
Krzywicki. Dover, 1995, xiii + 384 pp, 
$12.95 (P). [ISBN 0-486-69193-4] Slightly 
corrected republication of 1967 Harper & Row 
text (TR, August-September 1968). 


Applications (Systems Theory), P. Stability 
Theory. Eds: R. Jeltsch, M. Mansour.  In- 
tern. Ser. of Num. Math., V. 121. Birkhäuser
Boston, 1996, vii + 249 pp, $122.95. [ISBN 0- 
8176-5474-7] Papers from a 1995 conference 
in Ascona, Switzerland marking the centennial 
of Hurwitz’ paper on the location of roots of a 
polynomial. 


Applications, T(17-18), S, L. Distributions
in the Physical and Engineering Sciences, Vol- 
ume 1: Distributional and Fractal Calculus, 
Integral Transforms and Wavelets. Alexander 
I. Saichev, Wojbor A. Woyczynski. Appl. & 
Num. Harm. Anal. Birkhäuser Boston, 1997,
xviii + 336 pp, $45. [ISBN 0-8176-3924-1]
A modern course in ‘Advanced Mathematics 
for Engineers and Scientists.’ Unifying theme 
is distribution theory, enriched with topics such 
as wavelets, nonlinear phenomena, white noise 
theory. Background needed in elementary dif- 
ferential equations, linear algebra, Fourier se- 
ries, complex variables. Knowledge of Math- 
ematica, Maple, or MATLAB useful for stu- 
dent projects. Well-written; examples, exer- 
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Applications, T*(15-18). Applied Mathemat- 
ics, Second Edition. J. David Logan. Wiley, 
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16513-1] Solid follow-up to a very good First 
Edition (TR, April 1988). Major changes in- 
clude additions to perturbation methods, divi- 
sion of ODE’s and PDE’s into different chap- 
ters, and improvements to coverage of integral 
equations. Chapters on similarity and finite dif- 
ference methods have been deleted. Overall, an 
outstanding choice of topics. A fine book. MPR 
GEOMETRY TURNED ON

Dynamic Software in Learning, Teaching, and Research

The Mathematical Association of America

Dynamic geometry is active, exploratory geometry 
carried out with interactive computer software. It has 
had a profound effect on classroom teaching wherever 
it has been introduced and has become an indispens- 
able research tool for mathematicians and scientists. 
The papers in this volume give a good idea of the ways 
in which the software can be used, and some of the 
effects it can have. It is clear that the software raises 
various questions for teaching and research, and its 
continuing evolution raises questions on the design of 
the software itself. 


With the use of interactive computer software, the focus 
in teaching shifts from students laboriously making con- 
structions by hand to verify a stated fact in a text (for 
which there seems little reason to produce a proof) to
a focus on students carrying out experiments, quickly
producing many accurate sketches from which they 
conjecture properties that seem to be “always” true. 


Geometry Turned On

Dynamic Software in Learning, Teaching and Research

JAMES KING AND DORIS SCHATTSCHNEIDER, EDITORS
Learn how you can use interactive computer software in the study
and teaching of geometry
Series: MAA Notes


The latest dynamic geometry software takes advantage
of current computer hardware with mouse interface
for graphics and great speed. What is most exciting
about the software is its dynamic nature. Thus, after
a geometric configuration is drawn, any unconstrained
parts of the configuration (arbitrary segments or points, 
for example), that are not dependent on any other 
objects are moveable — they can literally be grabbed 
with a cursor (using the mouse) and can be dragged or 
stretched — and as they move, all other objects in the 
configuration automatically self-adjust, preserving all 
dependent relationships and constraints. 


Although this volume is printed in a conventional man- 
ner, and every paper has illustrations, most of the illus- 
trations beg to be played with. We want you to be able 
to experience some of the explorations described by 
our authors. To make that possible, dynamic sketches 
that use Geometer’s Sketchpad or Cabri II have been 
made available by several of the authors and are posted 
on a Web page maintained by the Mathematics Forum. 
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If you teach calculus, you should read this book. If you 
want to know what mathematics your students understand, 
or if you want to know how to find out what they under- 
stand, this book contains essential information for you. 


It doesn’t matter whether you teach a reform or traditional 
course, whether you have large or small sections, or 
whether you use lectures or laboratories. The bottom line 
is the same: When all is said and done, what counts is 
what our students understand. And that’s what Student 
Assessment in Calculus is about. 


Over the last ten years calculus instruction has changed 
in numerous ways. Whether they were trying on new 
ideas or following the more traditional routes towards 
conceptual understanding, both individual faculty and 
departments needed to know if their instruction was 
effective. To help deal with that issue, the National Science 
Foundation brought together a Working Group of experts 
in students’ mathematical thinking, in assessment, and in 
calculus reform. The goals of their work were to: 
• develop a framework to tailor calculus instruction to the
students’ needs;

• establish an agenda for further research on student
understanding;

• describe how to make use of a range of techniques to
test what students know, such as multiple-choice tests or
short essay questions, student portfolios and “clinical”
interviews;

• summarize major goals of the reform movement and
describe the challenges faced by those who are taking a
closer look at how students learn;

• illustrate the ways in which calculus projects attempt
(via exams, papers, projects, etc.) to find out what their
students have learned.


This book is the result of those efforts. If you teach calculus,
if you want to see examples of useful assessment techniques,
or if you are interested in issues of how to measure
student learning in mathematics, then there is a lot for
you here.


Student Assessment in Calculus

A Report of the NSF Working Group on
Assessment in Calculus

ALAN SCHOENFELD, EDITOR
Series: MAA Notes
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This volume grew out of the work of the 
Linear Algebra Curriculum Study Group (orga- 
nized by David Carlson, Charles R. Johnson, 
David C. Lay and A. Duane Porter), and the 
1993 Special Issue on Linear Algebra of the 
College Mathematics Journal (then edited by 
Ann Watkins and William Watkins). This book 
argues that the teaching of elementary linear 
algebra can be made more effective by empha- 
sizing applications, exposition, and pedagogy. 


• Relevant applications serve as motivation for
students and as sources of stimulating and
challenging problems.

• Effective exposition that finds the right way to
communicate concepts is especially important
in the teaching of linear algebra, often the first
course in which students come to grips with
abstraction and complexity, and with multiple
representations of the same idea.

• Attention to pedagogy that takes into account
how students learn, technology, and new
teaching ideas such as cooperative learning
can go a long way toward improving the
teaching of linear algebra.


This volume includes the recommendations of 
the Linear Algebra Curriculum Study Group, 
with their core syllabus for the first course, and 
the thoughts of mathematics faculty who have 
taught linear algebra using these recommendations.
It includes elucidation of these ideas, trenchant
criticism of them, and a report on putting them
into practice.
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This book contains the best problems selected 
from over 25 years of the Problem of the Week 
at Macalester College. Readers will find here a 
collection of intriguing and thought provoking 
problems that will give students (high school or 
beyond), teachers, and university professors a 
chance to experience the pleasure of wrestling 
with some beautiful problems of elementary 
mathematics. 


Compare your sleuthing talents with those of 
Sherlock Holmes, who made a bad mistake 
regarding the first problem in the collection: 
Determine the direction of travel of a bicycle 
that has left its tracks in a patch of mud. The 
collection contains a variety of other unusual 
and interesting problems in geometry, algebra, 
combinatorics and number theory. For exam- 
ple, if a pizza is sliced into eight 45-degree 
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wedges meeting at a point other than the center 
of the pizza, and two people eat alternate 
wedges, will they get equal amounts of pizza? 
Or: What is the rightmost nonzero digit of the 
product 1 · 2 · 3 ⋯ 1000000? Or: Is a manufacturer’s
claim that a certain unusual combination
lock allows thousands of combinations justified? 


Complete solutions to the 191 problems are 
included along with problem variations and 
topics for investigation. This collection will be 
especially valuable to teachers who are looking 
for stimulating ways to engage their students 
with the beauty and intrigue that can often be 
found in elementary mathematics. 
π is one of the few concepts in mathematics whose
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The Many Avatars of a Simple Algebra 


S. C. Coutinho 


1. INTRODUCTION. Some mathematical structures show up in many different 
contexts, under many different guises. This is the case with the Weyl algebra. Born 
in the cradle of quantum theory, in the 1920s, it has come up in the representation 
theory of enveloping algebras and has played a key role in the creation of 
D-module theory. It has recently returned to the parental home, under the
auspices of deformation theory. 

In this paper we survey the incarnations of the Weyl algebra associated to 
several formalisms of quantum mechanics. Beginning with the moment of conception
in the 1920s, we work our way through matrix mechanics, Schrödinger’s
equation and Dirac’s formalism. After a brief interlude where rings of differential 
operators are introduced, we return to quantum theory to look at quantisation by 
deformation and its version of the Weyl algebra. 


2. QUANTUM MECHANICS. The story begins in May 1925, when W. Heisenberg 
fell ill with a bout of hay fever so vicious that he decided to ask for a fortnight’s
leave to recover. He chose the island of Heligoland as a place to escape to. He
must have been in a dreadful state indeed, because the landlady of the inn where
he stopped for breakfast assumed, from his looks, that he had been involved in a
fight the night before [21, p. 248 ff].

In Heligoland, between walks and baths, Heisenberg carried on the work he had 
started in Göttingen. He was trying to develop a quantum mechanics, and his
fundamental intuition was that it should deal only with observable quantities.
Starting from that, Heisenberg developed a mathematical formulation of the
theory. However, it was not clear at first whether the mathematical scheme would
be consistent or not. 

Heisenberg felt that the real test of his scheme would be to check that it 
satisfied the law of conservation of energy. It took him a whole night to verify that 
energy was indeed conserved. Elated, he climbed a rock jutting out into the sea 
and watched the sun rise. 

Let us see how Heisenberg arrived at his scheme of quantum mechanics. 
Consider an electron moving in an atom. If the system were classical, then we 
would have a function x(t) describing the position of the electron as a function of 
time. We would also have Newton’s equation 


\[ \ddot{x} + f(x) = 0. \]


Heisenberg decided that this equation ought to be retained, but that it would be 
necessary to find a new interpretation for x(t). But the motion of the electron is 
periodic. Once again, if the system were classical, one could expand x(t) as a 
Fourier series. In this case, the coefficients of the series would represent the 
amplitudes. In the quantum case these coefficients should depend on a quantum 
number. Developing the mathematical scheme along these lines, Heisenberg was 
led ‘almost necessarily’ to a very weird looking formula for the multiplication of 
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amplitudes. In particular, as he explicitly stated in his original paper, these 
amplitudes do not commute; a fact that deeply troubled him. 

At first Heisenberg hoped to remove the need for non-commutative amplitudes 
from his theory. Unable to ‘improve’ the paper, he decided to come out with it and 
handed it over to Max Born shortly before leaving for England, where he would 
speak at the Kapitsa Club in Cambridge. 


3. MATRIX MECHANICS. Born did not look immediately at Heisenberg’s 
manuscript. It was the end of term, he felt tired and ‘afraid of hard thinking’ [22, 
p. 8 ff]. However, when he read through it a few days later, he was fascinated. 
Born immediately began to work on Heisenberg’s ideas. By simplifying Heisenberg’s 
notation and re-writing the formulae for the multiplication of amplitudes he 
immediately realised that it was formally like the product of matrices. It is 
interesting to note that at the time matrices were not in the toolkit of every 
physicist. Luckily Born still remembered matrices from his student days in Breslau, 
twenty years back. 

Soon Born began his own ‘constructive work’. Denoting by p and q the 
momentum and position variables of Heisenberg’s picture, Born realised that pq 
and qp were different because p and q were matrices. He also noted that 
Heisenberg’s formulae gave only the diagonal entries of the commutator $[p, q] = pq - qp$,
which had to be $i\hbar$. Here $\hbar$ denotes Planck’s constant divided by $2\pi$.

In Born’s own words: ‘repeating Heisenberg’s calculation in matrix notation, I 
soon convinced myself that the only reasonable value of the non-diagonal elements 
should be zero’ [27, p. 37]. Thus he arrived at the formula 


\[ pq - qp = i\hbar\,1, \tag{3.1} \]


where 1 denotes the identity matrix. In his words, this formula was ‘only a guess, 
and my attempts to prove it failed’. 

A few days later, Born met Pauli on the train between Göttingen and Hanover.
Unable to resist his enthusiasm, he told Pauli about his matrices and his difficulties 
with the proof of (3.1). Instead of showing interest, as Born had expected, Pauli 
accused him of spoiling Heisenberg’s idea with ‘futile mathematics’ [27, p. 37]. 

Having failed to engage Pauli’s interest, Born turned to his former student P. 
Jordan. Working together, they developed Heisenberg’s idea in the context of 
matrix calculus. This is the first time that (3.1) appears in print, with a ‘proof’ due 
to Jordan [27, p. 277]. 

The version of quantum mechanics that follows from the work of Heisenberg, 
Born, and Jordan is called matrix mechanics. In it the momentum and position are 
represented by matrices. Denoting these matrices by p and q, respectively, the 
equations of motion for an electron moving in one dimension, under a potential, 
take the form 


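\[
\frac{\partial q}{\partial t} = \frac{\partial H}{\partial p},
\qquad
\frac{\partial p}{\partial t} = -\frac{\partial H}{\partial q},
\tag{3.2}
\]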


where H is a function of p and q. These are Hamilton’s equations of motion, to 
which we return in the next section. The point to note here is that the equations 
involve two kinds of differentiation: by a scalar (time) and by matrices (p and q). 
The first poses no problem, but the same cannot be said of the second. We return 
to this question in §5. 
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4. HAMILTONIAN MECHANICS. Let us briefly review a few facts about hamil- 
tonian mechanics that we will require. Consider a particle of mass 1 moving along 
a straight line. Let q and p denote the position and momentum of the system. 
Since we have a classical system, these are numbers: the coordinates of phase 
space. Suppose that the particle is subject to a force F(q, t), which depends on 
position and time. 

Since the system is one dimensional, F can be derived from a potential V(q, t),
given by


\[ V(q,t) = -\int F(q,t)\,dq. \]


Hence the total energy of the system, which is the sum of the potential and kinetic
energy of the particle, is


\[ H(q,p,t) = \frac{p^2}{2} + V(q,t). \]


This is called the Hamiltonian or Hamiltonian function of the system. By Newton’s 
second law 


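\[ \frac{\partial p}{\partial t} = F(q,t) = -\frac{\partial H}{\partial q}. \]

On the other hand, a direct calculation shows that

\[ \frac{\partial q}{\partial t} = \frac{\partial H}{\partial p}. \]

The equations

\[
\frac{\partial p}{\partial t} = -\frac{\partial H}{\partial q},
\qquad
\frac{\partial q}{\partial t} = \frac{\partial H}{\partial p},
\tag{4.1}
\]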

are called Hamilton’s equations of motion. 

We have thus obtained Hamilton’s equations for a system that consists of a 
particle of unit mass moving on a straight line under a force F(q, t). In general, a 
Hamiltonian system of one degree of freedom is a second order system whose 
motion is determined by equations of the form (4.1). 
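
As a quick check of (4.1), and only as a sketch under the assumptions of this section (unit mass, so that the momentum p equals the velocity $\partial q/\partial t$, and $V = -\int F\,dq$), the two partial derivatives of $H = p^2/2 + V(q,t)$ can be computed directly:

```latex
\begin{align*}
\frac{\partial H}{\partial q} &= \frac{\partial V}{\partial q} = -F(q,t) = -\frac{\partial p}{\partial t}, \\
\frac{\partial H}{\partial p} &= \frac{\partial}{\partial p}\left(\frac{p^2}{2}\right) = p = \frac{\partial q}{\partial t}.
\end{align*}
```

The first line is Newton's second law rewritten in terms of H; the second is the 'direct calculation' referred to in the text.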

The quantities of classical mechanics are described in terms of infinitely 
differentiable complex-valued functions of p and q. For the sake of simplicity we 
shall restrict ourselves to polynomial functions. Thus we shall be concerned with 
the space C[ p, q] of polynomials in two commuting variables, which we denote by 
S. The system of equations (4.1) can be written in a very compact form using an 
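operation called the Poisson bracket which is defined, for $f, g \in S$, by


\[ \{f, g\} = \frac{\partial f}{\partial p}\frac{\partial g}{\partial q} - \frac{\partial f}{\partial q}\frac{\partial g}{\partial p}. \]
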
This is clearly a polynomial in p and q. The vector space S is a Lie algebra with 
respect to the Poisson bracket; but this will not be needed here. 


Returning to (4.1), an easy calculation shows that if we write x for the vector 
(p,q), the equations can be re-written in the form 


\[ \dot{x} = \{H, x\} \tag{4.2} \]
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where the bracket with H is calculated coordinatewise. At this stage this may seem 
just a little trick. In fact, the Poisson bracket is the algebraic counterpart of the 
symplectic structure that gives phase space its peculiar geometry; see [2]. More- 
over, this formalism guided Dirac in his formulation of quantum mechanics, as we 
shall now see. For more details about the Hamiltonian formalism see [26] or [1]. 
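
To see that (4.2) is just (4.1) in bracket form, one can evaluate the bracket on each coordinate. The sketch below assumes the sign convention $\{f,g\} = \frac{\partial f}{\partial p}\frac{\partial g}{\partial q} - \frac{\partial f}{\partial q}\frac{\partial g}{\partial p}$, which is the one compatible with the identity $\{p, f\} = \partial f/\partial q$ used in §5:

```latex
\begin{align*}
\{H, p\} &= \frac{\partial H}{\partial p}\frac{\partial p}{\partial q}
          - \frac{\partial H}{\partial q}\frac{\partial p}{\partial p}
          = -\frac{\partial H}{\partial q} = \frac{\partial p}{\partial t}, \\
\{H, q\} &= \frac{\partial H}{\partial p}\frac{\partial q}{\partial q}
          - \frac{\partial H}{\partial q}\frac{\partial q}{\partial p}
          = \frac{\partial H}{\partial p} = \frac{\partial q}{\partial t}.
\end{align*}
```

Here p and q are independent coordinates on phase space, so $\partial p/\partial q = \partial q/\partial p = 0$.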


5. DIRAC. In the meantime, in Cambridge, Heisenberg mentioned his ideas on
matrix mechanics at the end of his talk to the Kapitsa Club. One of the physicists 
present at the lecture was R. H. Fowler. In September 1925 Heisenberg sent the 
proofs of his paper to Fowler, who promptly handed them to his student Paul 
Dirac. Dirac looked at the paper but ‘at first could not make much of it’. Returning 
to it two weeks later, he realised that it ‘provided a clue to the problem of 
quantum mechanics’. Unaware of the developments in Germany, Dirac proceeded 
to work out his own version of quantum mechanics. 

Instead of interpreting the quantum variables as matrices, Dirac calculated with 
them formally. To use the Hamiltonian formalism he had to find an interpretation 
for the operation of differentiation with respect to a quantum variable, as we have 
already observed at the end of §3. Dirac’s solution was to point out that the
quantum analogue of differentiation by q, say, is taking the commutator with p.
Thus, if f is a function of p and q in the quantum algebra, then $\partial f/\partial q = [p, f]$.

One of Dirac’s great contributions was his identification of the classical ana- 
logue of the quantum commutator. According to Bohr’s correspondence principle 
the results of quantum mechanics should converge to the analogous classical 
results when Planck’s constant tends to zero. This weird ‘constant tends to zero’ 
really means that the numerical value of the constant should be small when it is 
expressed in the units of action characteristic of the class of systems under 
consideration. 

Guided by this principle, Dirac discovered that the commutator divided by $i\hbar$ is
the quantum analogue of the Poisson bracket of classical mechanics. The analogy
allowed him to derive formula (3.1). Furthermore, defining H to be the Hamiltonian
of the system, and assuming that ‘the orders of the factors of the products
occurring in quantum motion are unimportant’ he wrote the fundamental quantum
equation in the form


\[ \dot{x} = [H, x], \]


in complete analogy with (4.2).

This analogy also helps to explain Dirac’s formula for differentiation by q. 
Indeed, if f is a polynomial in the (commutative) variables p and q, one
immediately checks from the formula of the Poisson bracket that $\{p, f\} = \partial f/\partial q$.
The corresponding quantum formula is obtained by replacing {, } with [, ]. 
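
A small worked example may make the correspondence concrete. It is only an illustration: the choice $f = q^2 p$ is ours, the classical line uses the Poisson bracket of §4, and the quantum line assumes the normalised relation $pq - qp = 1$ (adopted in §6) together with the Leibniz rule $[a, bc] = [a, b]\,c + b\,[a, c]$ for commutators:

```latex
\begin{align*}
\{p, q^2 p\} &= \frac{\partial p}{\partial p}\,\frac{\partial (q^2 p)}{\partial q}
              - \frac{\partial p}{\partial q}\,\frac{\partial (q^2 p)}{\partial p}
              = 2qp, \\
[p, q^2 p] &= [p, q^2]\,p + q^2\,[p, p]
            = \bigl([p, q]\,q + q\,[p, q]\bigr)\,p
            = 2qp.
\end{align*}
```

In both settings, bracketing with p acts on $q^2 p$ exactly as formal differentiation with respect to q.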

As his papers show, Dirac clearly understood that the quantum mechanical 
quantities defined a new sort of algebra, for which the multiplication was not 
commutative. He later called these quantities q-numbers—as opposed to c-num- 
bers, which are the ordinary complex numbers. Dirac’s papers can be found in [27, 
p. 307 and 417]. 


6. QUANTUM ALGEBRA. Let us consider the algebraic background of the two 
interpretations of quantum mechanics that we have surveyed. Dirac assumes that 
he has ‘quantities’ that behave in a certain way. In other words, symbols that are 
subject to relations. In the one dimensional case two symbols p and q are required 
to represent momentum and position. They are related by $pq - qp = i\hbar \cdot 1$, where
1 denotes the identity of the quantum algebra $\mathcal{S}$. To avoid unnecessary complication
we normalize this relation to the form $pq - qp = 1$.

Algebraically, Dirac’s quantum algebra is constructed beginning with the complex
free algebra $F$ in two generators $x$ and $y$. The elements of the free algebra are
linear combinations (with complex coefficients) of words in $x$ and $y$. The product
of two words is obtained by juxtaposition. The quantum algebra $\mathcal{A}$ is the quotient
algebra of $F$ by the two-sided ideal generated by $xy - yx - 1$. Thus $p$ and $q$ are
the images of $x$ and $y$ in this quotient.

Every element of $\mathcal{A}$ is a linear combination of words in $p$ and $q$, a property
that $\mathcal{A}$ inherits from its parent free algebra. Now, from $[p, q] = 1$ one deduces that


$$[p, q^k] = kq^{k-1} \quad\text{and}\quad [p^k, q] = kp^{k-1}.$$


This agrees with Dirac’s observation that commutation is analogous to differentiation.
These commutation relations allow us to write every word in $p$ and $q$ as a
linear combination of monomials $q^k p^m$. Thus every element of $\mathcal{A}$ is a linear
combination of monomials of this form.
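
The rewriting of words into monomials $q^k p^m$ can be tried out mechanically. The following is a minimal sketch, not taken from the article: it assumes an element is stored as a dictionary sending the exponent pair $(k, m)$ to the coefficient of $q^k p^m$, and it uses the Leibniz-type identity $p^b q^c = \sum_j \binom{b}{j} \frac{c!}{(c-j)!} q^{c-j} p^{b-j}$, which follows from $[p, q] = 1$. The names `mul`, `sub`, `commutator` and `power` are illustrative only.

```python
from collections import defaultdict
from math import comb, perm

# An element is a dict {(k, m): coeff}, read as the sum of coeff * q^k p^m.
P = {(0, 1): 1}
Q = {(1, 0): 1}

def mul(u, v):
    """Product written in the monomials q^k p^m, moving every p past every q."""
    out = defaultdict(int)
    for (a, b), cu in u.items():
        for (c, d), cv in v.items():
            for j in range(min(b, c) + 1):
                out[(a + c - j, b + d - j)] += cu * cv * comb(b, j) * perm(c, j)
    return {key: val for key, val in out.items() if val != 0}

def sub(u, v):
    out = defaultdict(int, u)
    for key, val in v.items():
        out[key] -= val
    return {key: val for key, val in out.items() if val != 0}

def commutator(u, v):
    return sub(mul(u, v), mul(v, u))

def power(u, n):
    out = {(0, 0): 1}
    for _ in range(n):
        out = mul(out, u)
    return out

print(commutator(P, power(Q, 3)))   # {(2, 0): 3}, that is, 3 q^2
```

With this representation the word $pq$ comes out as $qp + 1$, which is the normalized relation itself.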

Working a little harder, we can show that the monomials $q^k p^m$, with $k, m \ge 0$,
form a basis of $\mathcal{A}$ as a complex vector space [6, Proposition 1.2.1]. This can be used
to define the degree of an element of $\mathcal{A}$. First define the degree of a monomial
$q^k p^m$ to be $k + m$. Now write $d \in \mathcal{A}$ as a linear combination of monomials of this
form: the maximum of the degrees of these monomials is called the degree of $d$ and
is denoted by $\deg(d)$.

The degree in $\mathcal{A}$ behaves in many ways like the degree of polynomials. For
$d_1, d_2 \in \mathcal{A}$,


(1) $\deg(d_1 + d_2) \le \max\{\deg(d_1), \deg(d_2)\}$,
(2) $\deg(d_1 d_2) = \deg(d_1) + \deg(d_2)$, and
(3) $\deg[d_1, d_2] \le \deg(d_1) + \deg(d_2) - 2$.


The proof of (1) is immediate, but the proof of (2) uses (3) and is somewhat
convoluted. An immediate consequence of (2) is that $\mathcal{A}$ is an integral domain: it
does not have any zero divisors. See [6, Ch. 2, §1].

The degree can be used to prove several properties of $\mathcal{A}$. For example, in §2 of
[7], Dirac characterizes all the derivations of $\mathcal{A}$. Recall that a derivation $D$ of $\mathcal{A}$ is
a $\mathbb{C}$-linear operator of $\mathcal{A}$ that satisfies $D(d_1 d_2) = d_1 D(d_2) + D(d_1) d_2$, for every
$d_1, d_2 \in \mathcal{A}$. With Dirac, we note that the order of $d_1$ and $d_2$ in the formula cannot
be changed. An easy way to produce derivations of $\mathcal{A}$ is to use the commutator.
Given $f \in \mathcal{A}$, define $P(d) = [d, f]$, for $d \in \mathcal{A}$. As one checks easily, this is a
derivation of $\mathcal{A}$. Derivations of this form are called inner derivations of $\mathcal{A}$.

Dirac showed that all the derivations of $\mathcal{A}$ are inner. Let us sketch the proof.
Let $D$ be a derivation of $\mathcal{A}$. Since commutation by $p$ and $q$ behaves like
differentiation, we can find $f \in \mathcal{A}$ such that $D(p) = [f, p]$ and $D(q) = [f, q]$. The
actual calculation is reminiscent of the way one finds a potential function for a
conservative polynomial vector field on the plane. Now, using induction on the
degree of $d \in \mathcal{A}$, one can check that $D(d) = [f, d]$.

Another very important property of $\mathcal{A}$ is that it has no proper two-sided ideals,
except zero. In other words, $\mathcal{A}$ is a simple algebra. However, it is not a division
ring: $p$ cannot have an inverse because, when multiplied by any non-zero element of $\mathcal{A}$, it
gives rise to an element of degree at least 1. Actually, the only invertible elements
of $\mathcal{A}$ are the non-zero constants.

The proof that $\mathcal{A}$ is simple goes as follows: suppose that $J$ is a non-zero
two-sided ideal of $\mathcal{A}$. Choose a non-zero element $d \in J$. Commuting with $p$ is
formally equivalent to differentiation by $q$. Hence commuting $d$ with $p$ enough
times we obtain a non-zero element $d' \in \mathcal{A}$ that does not involve $q$. But $J$ is a two-sided
ideal of $\mathcal{A}$. Thus every time we commute an element of $J$ with $p$, we get an
element of $J$. Hence $d' \in J$. Now repeat the process with $d'$, this time commuting it
as many times as necessary with $q$, until we arrive at a non-zero constant. Thus $J$
contains a non-zero constant, and so $J = \mathcal{A}$; this is what we wanted to prove. For
details see [6, Theorem 2.2.1].
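
The two rounds of commutation in this argument can be followed on a concrete element. The sketch below is only an illustration under stated assumptions: elements are dictionaries from exponent pairs $(k, m)$ to the coefficient of $q^k p^m$, the product uses the reordering rule that follows from $[p, q] = 1$, and the element `d` is a hypothetical example, not one from the article.

```python
from collections import defaultdict
from math import comb, perm

P = {(0, 1): 1}   # the generator p
Q = {(1, 0): 1}   # the generator q

def mul(u, v):
    """Product in the basis q^k p^m, assuming the relation pq - qp = 1."""
    out = defaultdict(int)
    for (a, b), cu in u.items():
        for (c, d), cv in v.items():
            for j in range(min(b, c) + 1):
                out[(a + c - j, b + d - j)] += cu * cv * comb(b, j) * perm(c, j)
    return {key: val for key, val in out.items() if val != 0}

def commutator(u, v):
    out = defaultdict(int, mul(u, v))
    for key, val in mul(v, u).items():
        out[key] -= val
    return {key: val for key, val in out.items() if val != 0}

def reduce_to_constant(d):
    # Commuting with p lowers the q-degree of every monomial by one.
    while any(k > 0 for (k, m) in d):
        d = commutator(P, d)
    # Commuting with q then lowers the p-degree by one.
    while any(m > 0 for (k, m) in d):
        d = commutator(d, Q)
    return d

d = {(2, 3): 1, (1, 1): 7, (0, 4): 6}   # q^2 p^3 + 7 q p + 6 p^4 (hypothetical)
print(reduce_to_constant(d))            # {(0, 0): 12}
```

Each loop stops exactly when the top-degree part has been reduced to exponent zero, so the result is a non-zero constant, here $2! \cdot 3! = 12$.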

We can also describe in a mathematical way the relation that Dirac found
between the quantum commutator and the Poisson bracket. Quantum mechanics is
represented by $\mathcal{A}$, and classical mechanics is represented by the complex algebra $S$
of polynomial functions on the variables $p$ and $q$, which stand for momentum and
position. Thus $S$ is a commutative algebra.

Let $B_k$ be the set of elements of $\mathcal{A}$ of degree $\le k$ and let $S(k)$ be the set of
homogeneous polynomials of degree $k$ in $S$. We define a map $\sigma_k: B_k \to S(k)$ as
follows: if $d \in B_k$ has degree $k$, ignore the monomials of degree $< k$, and replace
the generators $p$ and $q$ of $\mathcal{A}$ by the commuting variables $p$ and $q$ of $S$ in the monomials of degree $k$. This gives a homogeneous
polynomial of degree k, which we denote by o,(d). For example, if d = q*p° + 
7q°p° + 6p° + 3pq, then d has degree 9 and o,(d) = g*p” + 7q°p°. This is called 
the symbol map of degree $k$ of $\mathcal{A}$; it is a linear map of vector spaces, no more.
Note that if $d \in B_k$ has degree $< k$ then its symbol of degree $k$ is zero. This
construction is well-known from partial differential equation theory.

Now to the relation with the Poisson bracket. Let $d_1, d_2$ be elements of $\mathcal{A}$ of
degrees $k_1$ and $k_2$ respectively. By (3), the commutator $[d_1, d_2]$ has degree at most
$k_1 + k_2 - 2$. One can now check that


$$\sigma_{k_1+k_2-2}([d_1, d_2]) = \{\sigma_{k_1}(d_1), \sigma_{k_2}(d_2)\}.$$


This is one way to express the relation discovered by Dirac. We will come across
another way, more in the spirit of the correspondence principle, in §11.
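
The relation between symbols and the Poisson bracket can be checked on small examples. The sketch below makes several assumptions that are not in the article: elements of the quantum algebra and of $S$ are both stored as dictionaries from $(k, m)$ to the coefficient of $q^k p^m$, the bracket is taken as $\{f, g\} = \frac{\partial f}{\partial p}\frac{\partial g}{\partial q} - \frac{\partial f}{\partial q}\frac{\partial g}{\partial p}$ (so that $\{p, q\} = 1$), and `d1`, `d2` are hypothetical elements.

```python
from collections import defaultdict
from math import comb, perm

def mul(u, v):
    """Product in the basis q^k p^m, assuming pq - qp = 1."""
    out = defaultdict(int)
    for (a, b), cu in u.items():
        for (c, d), cv in v.items():
            for j in range(min(b, c) + 1):
                out[(a + c - j, b + d - j)] += cu * cv * comb(b, j) * perm(c, j)
    return {key: val for key, val in out.items() if val != 0}

def commutator(u, v):
    out = defaultdict(int, mul(u, v))
    for key, val in mul(v, u).items():
        out[key] -= val
    return {key: val for key, val in out.items() if val != 0}

def symbol(d, k):
    """Keep only the monomials of degree exactly k."""
    return {key: val for key, val in d.items() if sum(key) == k}

def poisson(f, g):
    """{f, g} = f_p g_q - f_q g_p for commuting polynomials in q and p."""
    out = defaultdict(int)
    for (a, b), cf in f.items():
        for (c, d), cg in g.items():
            coeff = cf * cg * (b * c - a * d)
            if coeff:
                out[(a + c - 1, b + d - 1)] += coeff
    return {key: val for key, val in out.items() if val != 0}

d1 = {(2, 1): 1, (1, 0): 3}   # q^2 p + 3 q, degree 3 (hypothetical)
d2 = {(1, 2): 1, (0, 1): 1}   # q p^2 + p,   degree 3 (hypothetical)
print(symbol(commutator(d1, d2), 4))          # {(2, 2): -3}
print(poisson(symbol(d1, 3), symbol(d2, 3)))  # {(2, 2): -3}
```

The lower-order terms of the commutator (here of degree 2 and 0) are discarded by the symbol map, which is why only the leading parts of `d1` and `d2` matter.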


7. MATRIX REPRESENTATIONS. Let us now turn to the Heisenberg-Born- 
Jordan version of quantum mechanics. In it p and q are matrices. First of all notice 
that there cannot be two finite matrices whose commutator is 1. The easiest way to 
see this is to observe that the trace of a commutator is always zero, while the trace 
of the identity matrix is always non-zero. Therefore any such matrices must be 
infinite. 

Thus we are led to a representation of $\mathcal{A}$ into the algebra $M_\infty(\mathbb{C})$ of infinite
matrices with complex coefficients. In other words, we must construct a homomorphism
of algebras of $\mathcal{A}$ into $M_\infty(\mathbb{C})$. This is easy, given that we have defined $\mathcal{A}$ as a
quotient of a free algebra. It is enough to find two matrices $P$ and $Q$ in $M_\infty(\mathbb{C})$
such that $PQ - QP = 1$; for example


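First define a map $\theta: F \to M_\infty(\mathbb{C})$ by $\theta(x) = P$ and $\theta(y) = Q$. Since $PQ - QP = 1$,
it follows that $xy - yx - 1$ belongs to the kernel of $\theta$. Thus $\theta$ induces a map
$\bar\theta: \mathcal{A} \to M_\infty(\mathbb{C})$. But we have already seen that $\mathcal{A}$ is simple. In particular, the image of
$\ker(\theta)$ in $\mathcal{A}$, which is the kernel of $\bar\theta$ and a proper two-sided ideal, must be zero. Hence $\bar\theta$ is injective. In other words, the subalgebra of
$M_\infty(\mathbb{C})$ generated by $P$ and $Q$ is isomorphic to $\mathcal{A}$.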
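
Both the trace obstruction and the need for infinite matrices show up in a small experiment. The sketch below uses one standard pair, assumed here for illustration and not necessarily the pair displayed in the article: $Q$ shifts the basis $1, x, x^2, \ldots$ and $P$ acts like $d/dx$. Cutting both down to their $n$-by-$n$ corners gives a commutator that is the identity except for the last diagonal entry, which is exactly what keeps the trace equal to zero.

```python
def truncated_pair(n):
    """n-by-n corners of two infinite matrices (an assumed, standard choice)."""
    P = [[j if i == j - 1 else 0 for j in range(n)] for i in range(n)]   # like d/dx
    Q = [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]   # like multiplication by x
    return P, Q

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

P, Q = truncated_pair(5)
C = commutator(P, Q)
print([C[i][i] for i in range(5)])   # [1, 1, 1, 1, -4]
print(trace(C))                      # 0
```

As $n$ grows the defect moves further down the diagonal, and it disappears only for the infinite matrices.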
Thus we have two ways of describing the quantum algebra $\mathcal{A}$: as a quotient of a
free algebra (Dirac’s way) or as a subalgebra of a matrix algebra (the Heisenberg- 
Born-Jordan way). A third scheme for doing quantum mechanics leads into yet one 
more description, perhaps the most fruitful, in terms of differential operators. 


8. WAVE MECHANICS. It has long been known that light shows phenomena that 
are better explained in terms of waves, and others that make better sense if it is 
thought of as a stream of small particles. Quantum theory reached a compromise, 
affirming a dual nature for light, both wave and particle. In 1924, Louis de Broglie, 
then a student working towards his doctorate in Paris, understood that the 
wave-particle dualism ought to be truly universal. If that were so, then a ‘particle’ 
such as an electron should also present the same dual nature of wave and particle. 
Langevin sent a copy of de Broglie’s thesis to Einstein, who wrote in reply: ‘he has
lifted a corner of the great veil’.

De Broglie’s work was the starting point of a third version of quantum mechanics,
developed by the Austrian physicist Erwin Schrödinger in 1925. Schrödinger’s
starting point can be best summed up in the aphorism where there is a wave, there
must also be a wave equation. Actually this was reportedly said by P. Debye at the
end of a colloquium in Zurich, in which Schrödinger explained de Broglie’s work to
his department; see [23, Ch. 6]. Using de Broglie’s formulae and a reasonable
heuristic argument, Schrödinger arrived at a very neat partial differential equation.
For an electron of mass $m$ moving in one dimension under a potential $V$, the
equation is


$$i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2} + V\psi. \tag{8.1}$$

At first this equation was hailed as a return to the good old days, with some
physicists hoping that it would drive out those strange matrices and non-commutative
quantities. Moreover, it was deterministic. However, what did the function $\psi$
represent? It takes complex values, for a start. Max Born once again came to the
rescue, and proposed that the wave function, as $\psi$ came to be called, did not
represent any physical quantity whatsoever. Only the square of its modulus
$|\psi(x, t)|^2$ had a physical interpretation. It represented the probability of finding an
electron at $x$ at the moment $t$. Despite much initial dispute, this became the
accepted interpretation.

This also rescued the uncertainty principle, which affirms that one cannot
measure at the same time and with arbitrary precision, the position and momentum
of a particle. In its mathematical form, it is a consequence of the quantum
commutation relation (3.1). Since $\psi$ is a solution of a differential equation, it
behaves deterministically. But $\psi$ cannot be measured. What one can measure is
$|\psi|^2$, a mere probability.

Let us spell out this scheme in more detail. The wave functions live in the space
$\mathcal{L}^2$ of square integrable functions defined on the real line and taking complex
values. This is a Hilbert space. In particular, it is endowed with an inner product; if
$\psi_1, \psi_2 \in \mathcal{L}^2$ then


$$\langle \psi_1, \psi_2 \rangle = \int_{\mathbb{R}} \psi_1 \overline{\psi_2}\, dx.$$


A wave function $\psi$ must be square integrable because $\langle \psi, \psi \rangle$ is equal to the
probability of finding the particle somewhere in the real line, which must be 1.
However not every element of $\mathcal{L}^2$ is a wave function: wave functions must be
differentiable, if they are going to satisfy Schrödinger’s equation. The observables
correspond to Hermitian operators on $\mathcal{L}^2$. Despite the name, these cannot be
observed directly. The magnitudes that are observed correspond to the eigenvalues
of these operators. Since the observables are Hermitian operators, the eigenvalues
are real numbers, which is what one would expect of physical quantities.

In the end, it turned out that Schrödinger’s wave mechanics is equivalent to the
matrix formulation. Note that in both versions one has operators: matrices, in
matrix mechanics; differential operators, in wave mechanics. But in matrix mechanics,
the matrices themselves change with time. The fundamental equations
(3.2) relate the time derivative of a matrix with the quantum equivalent of the
Hamiltonian. In wave mechanics, it is the wave function that changes with time.
The differential operators act on the space of functions in the usual way.

The connection between the two pictures comes through an operator $U(t)$ that
takes the wave function at $t_0$ into the wave function at $t$, namely $\psi(x, t) =
U(t)\psi(x, t_0)$. It can be deduced from physical considerations that $U(t)$ is a unitary
operator. Let $X_0$ be an observable in wave mechanics. Mathematically we are
talking about a Hermitian operator in $\mathcal{L}^2$. Write $X = X(t) = U(t)^{-1} X_0 U(t)$; this is
the ‘matrix’ that corresponds to $X_0$ in matrix mechanics. Differentiating this
formula with respect to $t$ and using Schrödinger’s equation, we arrive at


$$\dot{X}(t) = [H, X(t)],$$


which is the fundamental quantum equation in Dirac’s form. For more details see
[8, Ch. V, §28].

We can recreate the quantum algebra $\mathcal{A}$ in the language of wave mechanics.
This time we will be handling differential operators in $\mathcal{L}^2$. The operators we want
to consider are $\partial/\partial x$ and multiplication by $x$. For the sake of simplifying the
formulae, let us denote these operators by $\partial$ and $x$, respectively. If $\psi$ is a wave
function, then


$$[\partial, x](\psi) = \partial(x\psi) - x\,\partial(\psi) = \psi.$$

Since this holds for every $\psi$, we conclude that $[\partial, x] = 1$, the identity operator in
$\mathcal{L}^2$. Thus, proceeding as in §7, we can show that $\mathcal{A}$ is isomorphic to the complex
subalgebra of $\mathrm{End}_{\mathbb{C}}(\mathcal{L}^2)$ generated by the operators $\partial$ and $x$. In wave mechanics,
these operators correspond to momentum and position, as was to be expected.
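
The identity $[\partial, x] = 1$ is easy to test on polynomial test functions, which is an assumption made here for convenience (actual wave functions are square integrable, and polynomials are not). In this sketch a polynomial $c_0 + c_1 x + c_2 x^2 + \cdots$ is stored as the list of its coefficients; the function names are illustrative.

```python
def trim(poly):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def d(poly):
    """The operator of differentiation by x."""
    return trim([k * c for k, c in enumerate(poly)][1:])

def x(poly):
    """The operator of multiplication by x."""
    return trim([0] + list(poly))

def sub(u, v):
    n = max(len(u), len(v))
    u, v = list(u) + [0] * (n - len(u)), list(v) + [0] * (n - len(v))
    return trim([s - t for s, t in zip(u, v)])

def bracket(poly):
    """Apply the commutator of d and x to poly."""
    return sub(d(x(poly)), x(d(poly)))

psi = [5, 0, -2, 7]      # 5 - 2x^2 + 7x^3
print(bracket(psi))      # [5, 0, -2, 7]
```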


9. D-MODULES. We can represent the algebra $\mathcal{A}$ more economically as an
algebra of differential operators if we use polynomial functions. Let us start in a
little more generality. Let $R$ be a commutative algebra over $\mathbb{C}$. We define the ring
of differential operators $D(R)$ inductively as a subalgebra of $\mathrm{End}_{\mathbb{C}}(R)$. Since an
element of $R$ gives rise to a linear operator of $R$ by multiplication, the inductive
definition begins with $D^0(R) = R$, the operators of order zero. The operators of
order $k$ are


$$D^k(R) = \{d \in \mathrm{End}_{\mathbb{C}}(R) : [d, a] \in D^{k-1}(R) \text{ for all } a \in R\}.$$


Let $D(R)$ be the union of all $D^k(R)$ for $k \ge 0$. This turns out to be a subalgebra
of $\mathrm{End}_{\mathbb{C}}(R)$, though the proof is not quite obvious; see [6, Ch. 3, §1].

It is easy to calculate $D^1(R)$ explicitly. It is generated, as an $R$-module, by 1
and the $\mathbb{C}$-derivations of $R$. In particular, if $R = \mathbb{C}[x]$, the polynomial ring in one
variable, then $D^1(R) = R + R\partial$, where $\partial$ denotes the operator differentiation by
$x$. Thus the quantum algebra $\mathcal{A}$ is contained in $D(R)$ as the algebra generated by
$x$ and $\partial$. Working a little harder, we can prove that $\mathcal{A} = D(R)$; for details see [6,
Ch. 3, §2]. Hence the quantum algebra is the algebra of differential operators of
the ring of polynomials in one variable.

The preceding definition of rings of differential operators appears in
Grothendieck’s Éléments de géométrie algébrique [14, proposition 16.8.8]. The
notoriety of rings of differential operators nowadays is mainly due to D-module 
theory. A D-module is a finitely generated module over the algebra of differential 
operators of the coordinate ring of a smooth affine algebraic variety. To handle 
general varieties one must introduce sheaves [3, Ch. VI]. 

The importance of the theory lies in its numerous applications, which extend 
from mathematical physics to number theory. One of the most famous is to the 
representation theory of algebraic groups, where D-modules were used to settle
the Kazhdan-Lusztig conjecture in 1981. A very important D-module theoretic 
theorem used in the solution of the conjecture is the Riemann-Hilbert correspon- 
dence. This is a result of the noblest parentage. Its genealogical tree includes 
Riemann’s memoir on the hypergeometric function, Hilbert’s 21st problem, and the 
work of Deligne on regular connections. 


10. THE WEYL ALGEBRA. It did not take long for algebraists to notice that the
quantum algebra $\mathcal{A}$ was an interesting object of study. In 1933, D. E. Littlewood
wrote a paper [18] in which he proves most of the properties of $\mathcal{A}$ that we
considered in §6. He also gives several examples of infinite matrices satisfying
(3.1), among them the one of §7.

Littlewood’s language is rather antiquated. But in 1937, K. A. Hirsch published
a paper [16] in which he proves that every ring in a class that includes $\mathcal{A}$ is a simple
algebra. This is a thoroughly modern paper, written in the language of van der
Waerden’s Moderne Algebra. His approach is essentially the one presented in §6.
See also [5].

A great boost to the study of $\mathcal{A}$ came with the realization that it appears as a
quotient of enveloping algebras of nilpotent Lie algebras by primitive ideals. This
brought it into the fold of the representation theory of Lie algebras.

In fact $\mathcal{A}$ is the first member of a family of complex algebras. It corresponds to
a quantum system with one degree of freedom. The equations for systems with n 
degrees of freedom were found by Heisenberg himself, as early as September 1925. 
They also give rise to complex algebras that are simple integral domains. J. 
Dixmier studied these algebras in a series of papers in the 1960s, and was the first 
to call them Weyl algebras, after a suggestion of I. Segal; see [10]. He also 
introduced the notation A,(C) for the algebra corresponding to a system of n 
degrees of freedom; see [9]. Both the name and notation have become standard. 

The importance of the Weyl algebra has grown steadily in the last 30 years; see
[3], [6], [20]. The work on non-commutative Noetherian rings that followed A.
Goldie’s famous theorems on quotient rings of Noetherian rings [12], [13] and the
fact that the Weyl algebra is the simplest (but quite typical) ring of differential
operators have only added to its importance.


11. DEFORMATIONS. It is time to return to quantum mechanics. The three 
schemes that we studied in §§6-8 give rise to the method known as canonical 
quantisation. First of all, by quantisation we mean the process of turning a classical 
system into its corresponding quantum system. This is not a well-defined process. 
In canonical quantisation one starts with the Hamiltonian H of the classical system 
and systematically replaces the classical variables position and momentum by the 
operators $x$ and $\partial/\partial x$ of wave mechanics. One may now write the corresponding
Schrodinger equation and solve it. 

Several other methods of quantisation have been proposed: geometrical quanti- 
sation, asymptotic quantisation, deformation quantisation. It is the last of these 
that we want to study here. It leads us into another way of describing the Weyl
algebra: as a deformation of a polynomial ring. 

Let $S$ be a commutative $\mathbb{C}$-algebra. Denote by $S[[t]]$ the space of power series in
one variable with coefficients in $S$. Note that we are considering $S[[t]]$ as a vector
space only, and not as a ring. This is because what we really want to do is to define
a new multiplication in $S[[t]]$. To do that, we start with a family of bilinear maps
$B_j: S \times S \to S$, for $j \ge 0$. If $a, b \in S$, then their $*$-product in $S[[t]]$ is


$$a * b = \sum_{j=0}^{\infty} B_j(a, b)\, t^j.$$


Extending this linearly to the whole of $S[[t]]$, we obtain a multiplication in this
space. The multiplication is associative if the $B_j$ satisfy


$$\sum_{i+j=k} B_i(a, B_j(b, c)) = \sum_{i+j=k} B_i(B_j(a, b), c)$$


for $k \ge 0$ and all $a, b, c \in S$. This is not very easy to check for a given family of
bilinear maps. Doing it recursively, one is led to consider Hochschild cohomology, as
shown by M. Gerstenhaber in [11]. We do not pursue this line here; our aims are
more modest.

Two further assumptions are usually made. Since we want the $*$-product to be a
deformation of $S$, we must have $B_0(a, b) = ab$, the original product in $S$. If the
identity of $S$ is to be the identity of $S[[t]]$ with the product $*$, then we must also
have $B_j(a, b) = 0$ for $j > 0$ if either $a$ or $b$ is a scalar.

Let us return to quantum theory. We have seen that one of the key features of 
Dirac’s approach to quantum mechanics was the relation between the classical 
Poisson bracket and the quantum commutator. He arrived at this relation using the 
correspondence principle, which states that a quantum system should tend to its 
classical analogue when Planck’s constant tends to zero. This is also the starting 
point of the deformation theoretic approach to quantum mechanics. 

In this approach we begin with the classical phase space. Since we are consider- 
ing only a particle moving in a straight line, phase space is a two dimensional 
space. The classical dynamical variables are functions on phase space, and we are 
assuming that they are polynomial functions, to keep the going easy. The same is 
true in the deformation theoretic scheme. So far, so good. What we have to define 
anew is the multiplication of these observables. Furthermore, it must somehow 
depend on Planck’s constant. 

So let $S = \mathbb{C}[p, q]$ be the polynomial ring. We define a new product in $S[[\hbar]]$,
the space of formal power series in $\hbar$, using the deformation theoretic approach
just described. But what does it mean to say that the ‘commutator corresponds to
the Poisson bracket’? Let $f, g \in S$. Suppose we have constructed a deformation of
$S[[\hbar]]$ given by a family of bilinear maps $B_j$, for $j \ge 0$. Forming the commutator of
$f$ and $g$ as elements in the ring $S[[\hbar]]$ with this product, we get


$$f * g - g * f = \sum_{j=0}^{\infty} (B_j(f, g) - B_j(g, f))\, \hbar^j. \tag{11.1}$$
Since $B_0(f, g) = fg$ and $S$ is commutative, the first non-zero term of the power series in (11.1) is
$(B_1(f, g) - B_1(g, f))\hbar$. But, according to Dirac, the commutator $f * g - g * f$
divided by $i\hbar$ ought to be equal to the Poisson bracket when $\hbar$ goes to zero. Thus
$B_1(f, g) - B_1(g, f) = i\{f, g\}$. An easy way to achieve this is to require that $B_1(f, g)
= i\{f, g\}/2$, since the Poisson bracket is skew symmetric.

As we saw in §4, the Poisson bracket is really a bidifferential operator in the
arguments $f$ and $g$. Thus we may boldly propose to extend this assumption to all
the $B_j$. The question, of course, is: can one define a $*$-product in $S[[\hbar]]$ satisfying
all these conditions?

The answer is yes. This $*$-product is called the Moyal-Weyl product. It was used
by Moyal in [24] to study quantum statistical mechanics from the point of view of
classical phase space. This product can be described in a very compact way if we
use tensor products. First define the differential operator $\Pi: S \otimes S \to S \otimes S$ by
$\Pi(f \otimes g) = \partial f/\partial p \otimes \partial g/\partial q - \partial f/\partial q \otimes \partial g/\partial p$. Now let $\Delta: S \otimes S \to S$ be the
multiplication map $\Delta(f \otimes g) = fg$. One checks easily that the Poisson bracket can
be written using $\Pi$ and $\Delta$ as $\{f, g\} = \Delta\Pi(f \otimes g)$. More generally, the Moyal-Weyl
$*$-product of $f$ and $g$ is $f * g = \Delta(\exp(i\hbar\Pi/2)(f \otimes g))$, in agreement with $B_1(f, g) = i\{f, g\}/2$. As an example, let us
calculate the coefficient of the term in $\hbar^2$ of $f * g$. By definition it is $-\frac{1}{8}\Delta(\Pi^2(f \otimes g))$.
An easy calculation shows that this is equal to


$$-\frac{1}{8}\left(\frac{\partial^2 f}{\partial p^2}\frac{\partial^2 g}{\partial q^2} - 2\,\frac{\partial^2 f}{\partial p\,\partial q}\frac{\partial^2 g}{\partial p\,\partial q} + \frac{\partial^2 f}{\partial q^2}\frac{\partial^2 g}{\partial p^2}\right).$$


In particular, if either $f$ or $g$ has degree $\le 1$ then this term is zero.

More generally, if either $f$ or $g$ has degree $\le k$, then $\Delta(\Pi^{k+1}(f \otimes g)) = 0$.
Thus $f * g$ is indeed a polynomial. Moreover $p * q - q * p = i\hbar\{p, q\} = i\hbar$.
Normalizing $i\hbar$ to 1 we see that the algebra $S$ with the Moyal-Weyl $*$-product is
isomorphic to $\mathcal{A}$. This is not quite a proof that these two algebras are isomorphic,
because we have not verified that the Moyal-Weyl product is associative. We shall
not check this here, but leave it, instead, to the conscientious reader as an exercise.
For details see [25] and [28].
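
The closed formula lends itself to direct computation. The following is a minimal sketch rather than a definitive implementation, and it rests on two assumptions that are not in the article: the product is taken in the form $f * g = \sum_n \frac{t^n}{n!} \Delta(\Pi^n(f \otimes g))$ with the shorthand $t = i\hbar/2$, so that all coefficients stay rational, and a polynomial is stored as a dictionary sending $(a, b, n)$ to the coefficient of $p^a q^b t^n$. It checks the commutation relation and tests associativity on one hypothetical triple.

```python
from collections import defaultdict
from fractions import Fraction
from math import comb, factorial, perm

def star(u, v):
    """Moyal-Weyl product; keys (a, b, n) stand for p^a q^b t^n with t = i*hbar/2."""
    out = defaultdict(Fraction)
    for (a, b, r), cu in u.items():
        for (c, d, s), cv in v.items():
            for n in range(min(a + b, c + d) + 1):
                for j in range(n + 1):
                    # Pi^n pairs d_p^(n-j) d_q^j on the left factor
                    # with d_q^(n-j) d_p^j on the right factor, with sign (-1)^j.
                    w = perm(a, n - j) * perm(b, j) * perm(d, n - j) * perm(c, j)
                    if w:
                        coeff = Fraction((-1) ** j * comb(n, j) * w, factorial(n))
                        out[(a + c - n, b + d - n, r + s + n)] += cu * cv * coeff
    return {key: val for key, val in out.items() if val != 0}

def sub(u, v):
    out = defaultdict(Fraction, u)
    for key, val in v.items():
        out[key] -= val
    return {key: val for key, val in out.items() if val != 0}

P = {(1, 0, 0): Fraction(1)}
Q = {(0, 1, 0): Fraction(1)}
print(sub(star(P, Q), star(Q, P)))   # {(0, 0, 1): Fraction(2, 1)}, i.e. 2t = i*hbar

f = {(2, 1, 0): Fraction(1)}                         # p^2 q (hypothetical)
g = {(1, 2, 0): Fraction(1)}                         # p q^2 (hypothetical)
h = {(1, 0, 0): Fraction(1), (0, 2, 0): Fraction(1)} # p + q^2 (hypothetical)
print(star(star(f, g), h) == star(f, star(g, h)))    # True
```

Only differentiation of monomials is involved, which is the algorithmic advantage of the $*$-product over reordering words in $p$ and $q$.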

The deformation theoretic approach to the Weyl algebra is also interesting from 
an algorithmic point of view. Calculations with elements of the Weyl algebra are 
not exactly easy. The multiplication of two monomials of relatively small degree 
may give rise to a long string of terms. This is awkward to implement in a 
computer. The *-product approach bypasses all this and gives a closed formula in 
terms of differentiation of polynomials, a calculation that computers can handle. 


12. CONCLUDING REMARKS. G. H. Hardy says in A Mathematician’s Apology 
that ‘a mathematical idea is ‘significant’ if it can be connected, in a natural and 
illuminating way, with a large complex of other mathematical ideas’ [15, §11]. 
Having examined the evidence collected in the preceding sections, we can safely 
say that, by Hardy’s criterion, the Weyl algebra is a ‘significant idea’. This explains 
why it has been studied so intensely. A lot is known about the one-dimensional
Weyl algebra $\mathcal{A}$. Its right ideals have been classified in [4] and [17], and its
representation theory has been studied very thoroughly [19]. The same cannot be
said of the many-dimensional Weyl algebras mentioned in §10.

But even $\mathcal{A}$ still hides some secrets. For example, it is not known whether all
endomorphisms of $\mathcal{A}$ are surjective. This first appeared in print as ‘Problème 11.1’
in [10]; and it is closely related to the famous Jacobian conjecture; see [6, Ch. 4, §4].
Multiple Integrals of Symmetric Functions


Tiberiu Trif 


In this paper we illustrate a unified treatment of a class of multiple Riemann
integrals. Let $a$ and $b$ be real numbers such that $a < b$. In the sequel $C_n(a, b)$
denotes the $n$-dimensional cube $[a, b]^n$, while $D_n(a, b)$ denotes the set of all points
$(x_1, \ldots, x_n) \in \mathbb{R}^n$ such that $a \le x_1 \le \cdots \le x_n \le b$. A function $F$ from
$C_n(a, b)$ to $\mathbb{R}$ is called symmetric if $F(x_1, \ldots, x_n) = F(x_{\sigma(1)}, \ldots, x_{\sigma(n)})$ for all
$(x_1, \ldots, x_n) \in C_n(a, b)$ and all permutations $\sigma \in S_n$, the symmetric group of degree
$n$. The main result to be used is:


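Theorem. Let $a$ and $b$ be real numbers such that $a < b$, and let $F$ be a Riemann
integrable symmetric function from $C_n(a, b)$ to $\mathbb{R}$ such that for all $i \in \{1, \ldots, n\}$ and
all $(x_1, \ldots, x_n) \in C_n(a, b)$ the function $F(x_1, \ldots, x_{i-1}, \cdot\,, x_{i+1}, \ldots, x_n): [a, b] \to \mathbb{R}$ is
integrable. Then


$$\int \cdots \int_{C_n(a, b)} F(x_1, \ldots, x_n)\, dx_1 \cdots dx_n = n! \int \cdots \int_{D_n(a, b)} F(x_1, \ldots, x_n)\, dx_1 \cdots dx_n.$$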


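Proof: For any given $\sigma \in S_n$, put


$$D_\sigma = \{(x_1, \ldots, x_n) \in C_n(a, b) \mid a \le x_{\sigma(1)} \le \cdots \le x_{\sigma(n)} \le b\}$$


and


$$I_\sigma = \int \cdots \int_{D_\sigma} F(x_1, \ldots, x_n)\, dx_1 \cdots dx_n.$$


Let $e$ be the identity permutation in $S_n$. Taking into account that $F$ is symmetric
and applying Fubini’s theorem we obtain


$$I_\sigma = \int_a^b dx_{\sigma(n)} \int_a^{x_{\sigma(n)}} dx_{\sigma(n-1)} \cdots \int_a^{x_{\sigma(2)}} F(x_1, x_2, \ldots, x_n)\, dx_{\sigma(1)}$$

$$= \int_a^b dx_{\sigma(n)} \int_a^{x_{\sigma(n)}} dx_{\sigma(n-1)} \cdots \int_a^{x_{\sigma(2)}} F(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)})\, dx_{\sigma(1)} = I_e$$


for all $\sigma \in S_n$. Hence we have


$$n! \int \cdots \int_{D_n(a, b)} F(x_1, \ldots, x_n)\, dx_1 \cdots dx_n = n!\, I_e = \sum_{\sigma \in S_n} I_\sigma = \int \cdots \int_{C_n(a, b)} F(x_1, \ldots, x_n)\, dx_1 \cdots dx_n. \qquad \square$$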
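
The theorem can be illustrated numerically in the case $n = 2$. The sketch below is an assumption-laden check, not part of the proof: it uses a midpoint Riemann sum on a uniform grid, counts each diagonal cell with weight one half because half of it lies in $D_2(a, b)$, and the grid size is arbitrary. It compares the integral over the square with twice the integral over the triangle for $F = \min$ and $F = \max$ on $[0, 1]^2$, where the exact values are $1/3$ and $2/3$.

```python
def midpoints(a, b, n):
    h = (b - a) / n
    return [a + (i + 0.5) * h for i in range(n)], h

def cube_integral(F, a, b, n=200):
    """Midpoint sum of F over the square C_2(a, b)."""
    pts, h = midpoints(a, b, n)
    return sum(F(x, y) for x in pts for y in pts) * h * h

def simplex_integral(F, a, b, n=200):
    """Midpoint sum of F over the triangle D_2(a, b), where x <= y."""
    pts, h = midpoints(a, b, n)
    total = sum(F(pts[i], pts[j]) for i in range(n) for j in range(i + 1, n))
    total += 0.5 * sum(F(x, x) for x in pts)   # diagonal cells lie half inside D_2
    return total * h * h

for F in (min, max):
    whole = cube_integral(F, 0.0, 1.0)
    half = simplex_integral(F, 0.0, 1.0)
    print(F.__name__, round(whole, 4), round(2 * half, 4))
```

For a symmetric $F$ the two printed numbers agree up to rounding, and they approach the exact values as the grid is refined.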


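Example 1 [4, Problem 67, p. 61]. Let $f: [a, b] \to \mathbb{R}$ be an integrable function.
Then

$$\int \cdots \int_{D_n(a, b)} f(x_1) \cdots f(x_n)\, dx_1 \cdots dx_n = \frac{1}{n!} \left( \int_a^b f(x)\, dx \right)^n.$$


Solution: Apply the theorem to $F: C_n(a, b) \to \mathbb{R}$ defined by $F(x_1, \ldots, x_n) =
f(x_1) \cdots f(x_n)$.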
Example 2 [4, Problem 61, p. 60].


$$\iint_D \ln|\sin(x - y)|\, dx\, dy = -\frac{\pi^2}{2} \ln 2,$$


where $D = \{(x, y) \in \mathbb{R}^2 \mid 0 \le x \le y \le \pi\}$.
Solution: The theorem gives
$$\iint_D \ln|\sin(x - y)|\, dx\, dy = \frac{1}{2} I,$$
where
$$I := \int_0^\pi\!\!\int_0^\pi \ln|\sin(x - y)|\, dx\, dy = \left( \int_{x=0}^{\pi/2}\!\int_{y=0}^{\pi/2} + \int_{x=0}^{\pi/2}\!\int_{y=\pi/2}^{\pi} + \int_{x=\pi/2}^{\pi}\!\int_{y=0}^{\pi/2} + \int_{x=\pi/2}^{\pi}\!\int_{y=\pi/2}^{\pi} \right) \ln|\sin(x - y)|\, dx\, dy.$$
In the second integral we make the substitution $x = u$, $y = \pi/2 + v$, in the third
the substitution $x = \pi/2 + u$, $y = v$, and in the fourth the substitution $x =
\pi/2 + u$, $y = \pi/2 + v$. Thus we obtain
$$I = 2 \int_0^{\pi/2}\!\!\int_0^{\pi/2} \ln|\sin(u - v)|\, du\, dv + 2 \int_0^{\pi/2}\!\!\int_0^{\pi/2} \ln|\cos(u - v)|\, du\, dv$$

$$= 2 \int_0^{\pi/2}\!\!\int_0^{\pi/2} \ln \frac{|\sin(2u - 2v)|}{2}\, du\, dv = 2 \int_0^{\pi/2}\!\!\int_0^{\pi/2} \ln|\sin(2u - 2v)|\, du\, dv - \frac{\pi^2}{2} \ln 2.$$


Using the substitution $s = 2u$, $t = 2v$ we obtain $I = I/2 - (\pi^2 \ln 2)/2$. Hence
$I = -\pi^2 \ln 2$.


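Example 3 [3, Problem 2.1, p. 49]. Evaluate


$$\int_0^\infty\!\!\int_0^\infty |\ln x - \ln y|\, e^{-(x+y)}\, dx\, dy.$$


Solution: Let $I$ denote the value of the integral, so $I = \lim_{a \to \infty} I(a)$, where
$$I(a) = \int_0^a\!\!\int_0^a |\ln x - \ln y|\, e^{-(x+y)}\, dx\, dy.$$
The theorem gives
$$I(a) = 2 \int_0^a \left( \int_0^x (\ln x - \ln y)\, e^{-(x+y)}\, dy \right) dx$$

$$= 2 \int_0^a \left[ e^{-x}(1 - e^{-x}) \ln x - e^{-x} \int_0^x e^{-y} \ln y\, dy \right] dx$$

$$= 2 \int_0^a e^{-x}(1 - e^{-x}) \ln x\, dx - 2 \int_0^a e^{-x} \left( \int_0^x e^{-y} \ln y\, dy \right) dx.$$
Integrating by parts in the second integral we get
$$I(a) = 2 \int_0^a e^{-x} \ln x\, dx - 4 \int_0^a e^{-2x} \ln x\, dx + 2 e^{-a} \int_0^a e^{-x} \ln x\, dx.$$
Since


$$\lim_{a \to \infty} e^{-a} \int_0^a e^{-x} \ln x\, dx = 0,$$


we obtain


$$I = 2 \int_0^\infty e^{-x} \ln x\, dx - 4 \int_0^\infty e^{-2x} \ln x\, dx.$$


Making the substitution $2x = t$ in the second integral we get


$$I = 2 \int_0^\infty e^{-x} \ln x\, dx - 2 \int_0^\infty e^{-t} \ln \frac{t}{2}\, dt = 2 \ln 2.$$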


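Example 4 [1, p. 27].


$$\int_0^1 \cdots \int_0^1 M_r(x_1, \ldots, x_n)\, dx_1 \cdots dx_n = \frac{r}{n+1},$$


where $M_r(x_1, \ldots, x_n)$ denotes the $r$th largest of $x_1, \ldots, x_n$ (starting with the
smallest).


Solution: We have


$$\int \cdots \int_{D_n(0, 1)} x_r\, dx_1 \cdots dx_n = \int_0^1\!\! \int_0^{x_n}\!\! \cdots \int_0^{x_2} x_r\, dx_1 \cdots dx_n = \frac{1}{(r-1)!} \int_0^1\!\! \int_0^{x_n}\!\! \cdots \int_0^{x_{r+1}} x_r^r\, dx_r \cdots dx_n = \frac{r}{(n+1)!}.$$

But, on the other hand, the theorem ensures that
$$\int_0^1 \cdots \int_0^1 M_r(x_1, \ldots, x_n)\, dx_1 \cdots dx_n = n! \int \cdots \int_{D_n(0, 1)} x_r\, dx_1 \cdots dx_n = \frac{r}{n+1}.$$
Remark. In the special cases $r = 1$ and $r = n$ we obtain
$$\int_0^1 \cdots \int_0^1 \min(x_1, \ldots, x_n)\, dx_1 \cdots dx_n = \frac{1}{n+1}$$
and
$$\int_0^1 \cdots \int_0^1 \max(x_1, \ldots, x_n)\, dx_1 \cdots dx_n = \frac{n}{n+1},$$
respectively.


Example 5. Let $a_1, a_2, \ldots, a_n$ be positive real numbers, and let $a = a_1 a_2 \cdots a_n$.
Then


$$\int_0^{a_1}\!\! \int_0^{a_2}\!\! \cdots \int_0^{a_n} e^{\max\{a_2^2 a_3^2 \cdots a_n^2 x_1^2,\; a_1^2 a_3^2 \cdots a_n^2 x_2^2,\; \ldots,\; a_1^2 a_2^2 \cdots a_{n-1}^2 x_n^2\}}\, dx_n \cdots dx_2\, dx_1 = \frac{n}{a^{n-1}} \int_0^a x^{n-1} e^{x^2}\, dx.$$


Solution: Making the substitution $a_2 a_3 \cdots a_n x_1 = y_1$, $a_1 a_3 \cdots a_n x_2 = y_2$, $\ldots$,
$a_1 a_2 \cdots a_{n-1} x_n = y_n$ we obtain


$$\int_0^{a_1}\!\! \int_0^{a_2}\!\! \cdots \int_0^{a_n} e^{\max\{a_2^2 a_3^2 \cdots a_n^2 x_1^2,\; a_1^2 a_3^2 \cdots a_n^2 x_2^2,\; \ldots,\; a_1^2 a_2^2 \cdots a_{n-1}^2 x_n^2\}}\, dx_n \cdots dx_2\, dx_1$$

$$= \frac{1}{a^{n-1}} \int \cdots \int_{C_n(0, a)} e^{\max\{y_1^2, \ldots, y_n^2\}}\, dy_1 \cdots dy_n = \frac{n!}{a^{n-1}} \int \cdots \int_{D_n(0, a)} e^{y_n^2}\, dy_1 \cdots dy_n$$

$$= \frac{n!}{a^{n-1}} \int_0^a\!\! \int_0^{y_n}\!\! \cdots \int_0^{y_2} e^{y_n^2}\, dy_1 \cdots dy_n = \frac{n!}{a^{n-1}} \cdot \frac{1}{(n-1)!} \int_0^a y_n^{n-1} e^{y_n^2}\, dy_n.$$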

Remark. In the special case $n = 2$ we get


$$\int_0^a\!\! \int_0^b e^{\max\{b^2 x^2,\, a^2 y^2\}}\, dy\, dx = \frac{2}{ab} \int_0^{ab} x e^{x^2}\, dx = \frac{e^{a^2 b^2} - 1}{ab}.$$

This was a problem in the 50th William Lowell Putnam Competition, 1989 [2].


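Example 6 [5, p. 893]. Let $n$ be a positive integer, $n \ge 2$, and let $b \in\, ]0, 1]$. Then
$$\int_0^1 \cdots \int_0^1 \min\left\{1, \frac{b}{x_1}, \ldots, \frac{b}{x_n}\right\} dx_1 \cdots dx_n = \frac{nb - b^n}{n-1}.$$


Solution:


$$\int_0^1 \cdots \int_0^1 \min\left\{1, \frac{b}{x_1}, \ldots, \frac{b}{x_n}\right\} dx_1 \cdots dx_n = n! \int_0^1\!\! \int_0^{x_n}\!\! \cdots \int_0^{x_2} \min\left\{1, \frac{b}{x_1}, \ldots, \frac{b}{x_n}\right\} dx_1 \cdots dx_n = n! \int_0^1\!\! \int_0^{x_n}\!\! \cdots \int_0^{x_2} \min\left\{1, \frac{b}{x_n}\right\} dx_1 \cdots dx_n$$

$$= \frac{n!}{(n-1)!} \int_0^1 x_n^{n-1} \min\left\{1, \frac{b}{x_n}\right\} dx_n = n \int_0^b x^{n-1}\, dx + nb \int_b^1 x^{n-2}\, dx = \frac{nb - b^n}{n-1}.$$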


REFERENCES 


1. M. S. Klamkin and D. J. Newman, Inequalities and Identities for Sums and Integrals, Amer.
Math. Monthly 83 (1976), 26-30.

2. L. F. Klosinski, G. L. Alexanderson, and L. C. Larson, The Fiftieth William Lowell Putnam
Mathematical Competition, Amer. Math. Monthly 98 (1991), 319-327.

3. B. M. Makarov, M. G. Goluzina, A. A. Lodkin, and A. N. Podkorytov, Selected Problems in Real
Analysis, Transl. of Math. Monographs, Vol. 107, Amer. Math. Society, Providence, Rhode Island,
1992.

4. G. Pólya and G. Szegő, Problems and Theorems in Analysis I, Springer-Verlag, Berlin-Heidelberg-
New York, 1972.

5. K. B. Stolarsky, From Wythoff’s Nim to Chebyshev’s Inequality, Amer. Math. Monthly 98 (1991),
889-900.


Universitatea Babes-Bolyai 

Facultatea de Matematica si Informatica 
RO-3400 Cluj-Napoca, Str. Kogalniceanu Nr. 1 
Romania 

ttrif@math.ubbcluj.ro 


608 MULTIPLE INTEGRALS [August-September 


Some Inequalities for Principal Submatrices 


John Chollet 


1. INTRODUCTION. Let $A$ be in $M_n(\mathbb{C})$, the set of all $n$-by-$n$ complex matrices,
and let $\omega$ be a nonempty subset of $\{1, 2, \ldots, n\}$ with its elements listed in
increasing numerical order. Denote by $A[\omega]$ the principal submatrix of $A$ whose
entries are in the intersection of those rows and columns of $A$ specified by $\omega$. If
the matrix $A$ is positive definite Hermitian, then so is $A[\omega]$. A known matrix
inequality that holds for all positive definite $A$ and all $\omega$ is ([6], [7], [8], [13,
Theorem 7.7.8])

$$A^{-1}[\omega] \ge (A[\omega])^{-1}, \tag{1}$$

where the inequality denotes the positive semidefinite ordering (Loewner ordering)
of pairs of Hermitian matrices $A$ and $B$ in which $A \ge B$ if $A - B$ is positive
semidefinite [13, Def. 7.7.1].
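
In the smallest case the inequality can be checked by hand or by a few lines of code. The sketch below assumes a hypothetical real symmetric positive definite 2-by-2 matrix and takes $\omega = \{1\}$, so that both sides of (1) are 1-by-1 and the Loewner ordering reduces to the usual ordering of real numbers. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of a 2-by-2 matrix by the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical positive definite matrix (leading minors 2 and 5 are positive).
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]

lhs = inverse_2x2(A)[0][0]   # the entry of the inverse: A^{-1}[omega]
rhs = 1 / A[0][0]            # the inverse of the entry: (A[omega])^{-1}
print(lhs, rhs, lhs - rhs)   # 3/5 1/2 1/10
```

For a general symmetric matrix with entries $a, b, c$ the same computation gives the difference $b^2 / (a(ac - b^2))$, which is nonnegative whenever the matrix is positive definite.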

In this article, we generalize inequality (1) in two consecutive steps. The first of 
these is the substitution of a primary matrix function for the matrix inverse 
appearing in (1). In Section 2, we recall the definition of primary matrix functions, 
give examples, and obtain a general property involving matrix monotonicity. In 
Section 3, we discuss matrix convexity, obtain two families of inequalities, and 
survey some results of Chandler Davis. One of his results states that matrix 
convexity of all orders is a necessary and sufficient condition on a primary matrix 
function so that it may replace the matrix inverse in (1). Weakening the condition 
to convexity of a fixed order leads to an open question posed by Davis. 

The next step in the generalization of (1) is to replace the idea of “extracting a 
principal submatrix” with a more general mapping from M,(C) to M,(C). In 
Section 4, we consider normalized positive linear maps on matrices and review 
some of the work done by Ando, Choi, Davis, and Kadison. Most of their results 
were stated in the context of C*-algebras, but our description is matrix-theoretic. 


2. PRIMARY MATRIX FUNCTIONS AND MONOTONICITY. If f is a polyno- 
mial function in a variable x and A is a square matrix, it is not difficult to give a 
definition of the matrix function f(A) by simply requiring that A be substituted 
for x. This is exactly what is done in the Cayley-Hamilton Theorem. Similarly, in 
the case of an entire analytic function f, its power series can always be used to 
define f(A). Perhaps the best known matrix function of this type is the exponen- 
tial. 

Even if f is not analytic, a definition of f(A) exists that agrees with the
previous ones for appropriate functions and matrices. Assume f is a real- or
complex-valued function whose domain contains the eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$ of a diagonalizable matrix A and suppose A is
diagonalized as $A = U\,\mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)\,U^{-1}$. Then the
primary matrix function associated with the stem function f is

$$f(A) = U\,\mathrm{diag}(f(\lambda_1), f(\lambda_2), \ldots, f(\lambda_n))\,U^{-1}. \tag{2}$$

Proving that f(A) is well-defined, despite the non-uniqueness of a diagonalizing
factorization of A, is a straightforward application of Lagrange’s interpolation
formula [14, pp. 407-408]; the basic property is that $f(SAS^{-1}) = Sf(A)S^{-1}$ for any
nonsingular S [14, Theorem 6.2.9(c)]. For Hermitian A, the matrix U may be taken
to be unitary ($U^{-1} = U^*$, the complex conjugate transpose of U). The notation
f(A) is used with the tacit understanding that the eigenvalues of A are in the
domain of f.
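
Definition (2) can be sketched numerically for real symmetric 2-by-2 matrices, where the diagonalizing matrix U may be taken to be a plane rotation. The helper names `sym2_eig` and `primary_function` below are illustrative assumptions and do not come from [14]; the sketch only covers this special case.

```python
import math


def sym2_eig(a, b, d):
    """Diagonalize the real symmetric matrix [[a, b], [b, d]].

    Returns (lam1, lam2, theta) such that A = U diag(lam1, lam2) U^T with
    U = [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]].
    """
    theta = 0.5 * math.atan2(2.0 * b, a - d)   # Jacobi rotation angle
    c, s = math.cos(theta), math.sin(theta)
    lam1 = a * c * c + 2.0 * b * c * s + d * s * s
    lam2 = a * s * s - 2.0 * b * c * s + d * c * c
    return lam1, lam2, theta


def primary_function(f, A):
    """Evaluate f(A) = U diag(f(lam1), f(lam2)) U^T, as in definition (2)."""
    (a, b), (_, d) = A
    lam1, lam2, theta = sym2_eig(a, b, d)
    c, s = math.cos(theta), math.sin(theta)
    f1, f2 = f(lam1), f(lam2)
    off = (f1 - f2) * c * s
    return [[f1 * c * c + f2 * s * s, off],
            [off, f1 * s * s + f2 * c * c]]


A = [[2.0, 1.0], [1.0, 2.0]]                    # eigenvalues 3 and 1
square = primary_function(lambda x: x * x, A)   # agrees with the product A*A
root = primary_function(math.sqrt, A)           # positive definite square root
```

For the polynomial stem function x^2 the result coincides with the ordinary matrix product, which illustrates the remark that (2) agrees with the earlier definitions.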

Since the matrices considered in this paper are Hermitian (and hence
diagonalizable), the definition given in (2) is sufficient for our purposes. However, a
primary matrix function whose domain is all complex matrices with eigenvalues in 
a given set can be defined by using the Jordan canonical form and by requiring 
that its stem function satisfy certain differentiability conditions; this definition 
reduces to (2) on diagonalizable matrices [14, Section 6.2]. 

It follows immediately from (2) that if f is a real-valued function and A is
Hermitian, then f(A) is also Hermitian. For f(A) to be positive semidefinite or
positive definite, f must be nonnegative or positive, respectively, on the
eigenvalues of A. Now, for $f(x) = x^{-1}$ and any positive definite A, $A^{-1} = f(A)$ and
inequality (1) becomes

$$f(A)[\omega] \ge f(A[\omega]). \tag{3}$$

As we shall see, other functions for which inequality (3) holds are $x^2$ and $x^{-1/2}$,
while the reverse inequality holds for $x^{1/2}$. In the next section, we demonstrate
that (3) holds only if f is a convex matrix function of order n and that if f is a
non-constant monotone matrix function of order $n \ge 2$ then inequality (3) for $f^{-1}$
implies the reverse inequality for f. Moreover, if A is a rank one Hermitian or
positive semidefinite Hermitian matrix, then the inequality is true for all of the
functions $f_\alpha(x) = x^\alpha$, $\alpha \ge 1$, with some restrictions on $\alpha$ in the Hermitian case.
The rest of the discussion in this section is devoted to verifying these assertions
and to introducing the concept of matrix monotonicity.

We begin with a matrix version of a result of Kadison [15, Theorem 1].


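Theorem 1. If A is Hermitian, then $A^2[\omega] \ge A[\omega]^2$.


Proof: Let P be a permutation matrix such that $PAP^*$ can be partitioned as
follows:

$$PAP^* = \begin{pmatrix} A[\omega] & B \\ B^* & C \end{pmatrix}.$$

Then

$$PA^2P^* = \begin{pmatrix} A[\omega] & B \\ B^* & C \end{pmatrix}\begin{pmatrix} A[\omega] & B \\ B^* & C \end{pmatrix} = \begin{pmatrix} A[\omega]^2 + BB^* & A[\omega]B + BC \\ B^*A[\omega] + CB^* & B^*B + C^2 \end{pmatrix}.$$

If $\omega$ has k elements and $\omega' = \{1,2,\ldots,k\}$, then, for any n-by-n matrix E,
$(PEP^*)[\omega'] = E[\omega]$. Thus, $A^2[\omega] = A[\omega]^2 + BB^* \ge A[\omega]^2$. ∎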
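
The block identity used in the proof can be checked in exact integer arithmetic. The following is a minimal sketch on a hypothetical real symmetric test matrix (not taken from the source), with $\omega = \{1, 2\}$ so that no permutation is needed; the helper names are illustrative.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def block(X, rows, cols):
    """Entries of X in the given rows and columns (zero-based indices)."""
    return [[X[i][j] for j in cols] for i in rows]


# Hypothetical real symmetric (hence Hermitian) matrix, chosen for illustration.
A = [[1, 2, 3],
     [2, 0, -1],
     [3, -1, 4]]
omega, rest = [0, 1], [2]   # omega = {1, 2} in the one-based numbering of the text

lhs = block(matmul(A, A), omega, omega)     # A^2[omega]
A_w = block(A, omega, omega)                # A[omega]
B = block(A, omega, rest)                   # off-diagonal block B
B_star = [list(col) for col in zip(*B)]     # B^* (plain transpose in the real case)
gap = matmul(B, B_star)                     # B B^*, a positive semidefinite matrix
rhs = [[p + q for p, q in zip(r1, r2)]
       for r1, r2 in zip(matmul(A_w, A_w), gap)]   # A[omega]^2 + B B^*
```

Here `lhs` and `rhs` coincide, and the difference $A^2[\omega] - A[\omega]^2$ is exactly the Gram-type matrix `gap`.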


It is well-known that if A and B are positive semidefinite and $A \ge B$, then
$A^{1/2} \ge B^{1/2}$; a very short proof is given in [8, p. 199]. Moreover, the positive
semidefinite square root of a positive semidefinite matrix is unique. Using these
facts, one obtains the following corollary:


Corollary 1. If A is positive semidefinite Hermitian, then $A[\omega]^{1/2} \ge A^{1/2}[\omega]$.

Proof: Note that $A[\omega] = (A^{1/2})^2[\omega] \ge (A^{1/2}[\omega])^2$ by Theorem 1. Taking square
roots, we find that $A[\omega]^{1/2} \ge ((A^{1/2}[\omega])^2)^{1/2} = A^{1/2}[\omega]$. ∎


It is also well-known that if $A \ge B > 0$, then $B^{-1} \ge A^{-1}$ ([8, p. 198] or [13,
Corollary 7.7.4]). Applying this result to the inequality in Corollary 1, one gets
$(A^{1/2}[\omega])^{-1} \ge (A[\omega]^{1/2})^{-1}$. Inequality (1) and transitivity conclude the proof of
the next corollary.


Corollary 2. If A is positive definite Hermitian, then $A^{-1/2}[\omega] \ge A[\omega]^{-1/2}$.


In the next section, we obtain two families of inequalities that contain the 
preceding corollaries as special cases. 

Loewner in [17] introduced a partial order on the set of Hermitian matrices,
defined monotone matrix functions in terms of this ordering, and gave necessary
and sufficient conditions for a matrix function to be monotone. A real-valued
function f on a real interval I is said to be a monotone matrix function of order n
on I [14, p. 536] if $f(A) \ge f(B)$ whenever the n-by-n Hermitian matrices A and B
have eigenvalues in I and satisfy $A \ge B$. If f is matrix monotone for all n, it is
called operator monotone. Loewner proved that a primary matrix function is an
operator monotone matrix function on I if and only if its stem function f is the
restriction to I of an analytic function on the upper half-plane
$\{z \in \mathbb{C} : \operatorname{Im} z > 0\}$ that maps the upper half-plane into itself and has
finite boundary values [14, p. 541]. This Loewner mapping criterion can be used to
show that the functions $x^\alpha$ for $0 \le \alpha \le 1$, $\log x$, and $-x^{-1}$ are operator
monotone on $(0,\infty)$ (problem 20 of Section 6.6 of [14]).

Conditions for matrix monotonicity (and convexity) can also be given in terms of
suitable generalizations of the divided differences appearing in Newton’s
interpolating polynomial; since notation varies, we use that of [14, Section 6.1] for
ease of reference. If f is a function whose domain contains the distinct real or
complex numbers $t_1, t_2, \ldots, t_n$, then Newton’s interpolating polynomial (of degree
at most $n-1$) is

$$p_{n-1}(t) = f(t_1) + \Delta f(t_1,t_2)(t-t_1) + \Delta^2 f(t_1,t_2,t_3)(t-t_1)(t-t_2) + \cdots + \Delta^{n-1} f(t_1,t_2,\ldots,t_n)(t-t_1)\cdots(t-t_{n-1}),$$

where

$$\Delta^l f(t_1,t_2,\ldots,t_{l+1}) = \sum_{i=1}^{l+1} f(t_i) \prod_{j \ne i} (t_i - t_j)^{-1} \quad\text{for } l = 1,2,\ldots,n-1.$$

The divided differences $\Delta^l f(t_1,t_2,\ldots,t_{l+1})$ can still be defined even if the
points $t_1,t_2,\ldots,t_n$ are not distinct as long as f is analytic on a simply connected
domain with these points in its interior or f is continuously differentiable on a real
interval containing these points [14, Section 6.1.14]. For a continuously
differentiable stem function f, the associated matrix function is a monotone matrix
function of order n on I if and only if the n-by-n Loewner matrix $[\Delta f(t_i, t_j)]$ is
positive semidefinite Hermitian for all $t_1,t_2,\ldots,t_n$ in I [14, p. 536].
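
The Loewner matrix criterion is easy to try out for n = 2. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the derivative is passed in explicitly to supply the confluent entries $\Delta f(t,t) = f'(t)$, and positive semidefiniteness is tested only for 2-by-2 symmetric matrices.

```python
import math


def loewner_matrix(f, fprime, pts):
    """Matrix of first divided differences [Delta f(t_i, t_j)].

    Diagonal (confluent) entries use the derivative, Delta f(t, t) = f'(t).
    """
    def dd(s, t):
        return fprime(s) if s == t else (f(s) - f(t)) / (s - t)
    return [[dd(s, t) for t in pts] for s in pts]


def is_psd_2x2(M, tol=1e-12):
    """A symmetric 2-by-2 matrix is PSD iff its diagonal and determinant are >= 0."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return M[0][0] >= -tol and M[1][1] >= -tol and det >= -tol


# x^(1/2): the Loewner matrix at the sample points 1 and 4 is PSD.
L_sqrt = loewner_matrix(math.sqrt, lambda t: 0.5 / math.sqrt(t), [1.0, 4.0])

# x^2: the Loewner matrix at 1 and 2 is [[2, 3], [3, 4]], with determinant -1.
L_square = loewner_matrix(lambda t: t * t, lambda t: 2.0 * t, [1.0, 2.0])
```

A single indefinite Loewner matrix, as for $x^2$ here, already shows that the function is not a monotone matrix function of order 2 on an interval containing the sample points; a PSD result at one choice of points is only consistent with monotonicity and does not prove it.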

The stem function of a monotone matrix function of order $n \ge 2$ on $[0,\infty)$ is
either strictly increasing, and so invertible, or constant (problem 9 in Section 6.6 of
[14]). This fact and the proof of Corollary 1 for $f(x) = x^{1/2}$ and $f^{-1}(x) = x^2$
suggest the next theorem.


Theorem 2. Let f be a monotone matrix function of order $n \ge 2$ on $[0,\infty)$. Suppose f
is not a constant function on $[0,\infty)$ and suppose $A \ge 0$. If $f^{-1}(A)[\omega] \ge f^{-1}(A[\omega])$,
then $f(A[\omega]) \ge f(A)[\omega]$.


The Spectral Theorem ([13, Theorem 4.1.5] or [20, p. 223]) ensures that a
Hermitian matrix A of rank one can be written as $A = \lambda yy^*$, where $\lambda$ is the
nonzero eigenvalue and y is a unit eigenvector corresponding to this eigenvalue. If
f(A) is defined and $f(0) = 0$ (as is the case for the power functions used below), then
$f(A) = f(\lambda)yy^*$ for any such y.


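Theorem 3. Let $\alpha \ge 1$ and let A be a rank one Hermitian matrix. The inequality
$A^\alpha[\omega] \ge A[\omega]^\alpha$ holds for all $\omega$ if either (a) $(-1)^\alpha = 1$ or (b) A is positive
semidefinite.


Proof: If $A = \lambda yy^*$, then $A^\alpha = \lambda^\alpha yy^*$. Let $\omega = \{i_1, i_2, \ldots, i_k\}$. Writing $y[\omega]$ for
the k-by-1 column matrix whose rows are the rows numbered $i_1, i_2, \ldots, i_k$ of y,
one gets $A^\alpha[\omega] = (\lambda^\alpha yy^*)[\omega] = \lambda^\alpha y[\omega]y^*[\omega]$. If $y[\omega] = 0$, then there is nothing
to prove because $A[\omega] = A^\alpha[\omega] = 0$. Otherwise $y[\omega]$ is an eigenvector of $A[\omega]$
corresponding to its only nonzero eigenvalue $\lambda\beta$ with $\beta = y^*[\omega]y[\omega] \le y^*y = 1$.
It is then possible to write $A[\omega] = \lambda\beta(\beta^{-1}y[\omega]y^*[\omega])$ and, by the remark before
the theorem, $A[\omega]^\alpha = \lambda^\alpha\beta^\alpha(\beta^{-1}y[\omega]y^*[\omega])$. Now

$$A^\alpha[\omega] - A[\omega]^\alpha = \lambda^\alpha y[\omega]y^*[\omega] - \lambda^\alpha\beta^{\alpha-1}y[\omega]y^*[\omega] = \lambda^\alpha(1 - \beta^{\alpha-1})\,y[\omega]y^*[\omega],$$

which is positive semidefinite in either case (a) or (b) because $\beta^{\alpha-1} \le 1$. ∎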


Writing $f^{-1}(x) = x^\alpha$ and $f(x) = x^{1/\alpha}$, one obtains the following as a corollary
of Theorems 2 and 3 and the fact that $x^{1/\alpha}$ is a monotone matrix function for
$\alpha \ge 1$.


Corollary 3. If A is a rank one positive semidefinite Hermitian matrix and $\alpha \ge 1$, then
$A[\omega]^{1/\alpha} \ge A^{1/\alpha}[\omega]$ for all $\omega$.


3. CONVEX AND MATRIX CONVEX FUNCTIONS. A real-valued function f
defined on an interval I of real numbers is convex on I if for all $\lambda_1, \lambda_2, \ldots, \lambda_n$ in
I, the inequality $f(u_1\lambda_1 + u_2\lambda_2 + \cdots + u_n\lambda_n) \le u_1f(\lambda_1) + u_2f(\lambda_2) + \cdots + u_nf(\lambda_n)$
holds for all $u_1, u_2, \ldots, u_n$ in [0,1] with $u_1 + u_2 + \cdots + u_n = 1$. Sometimes, for
precision, we refer to this type of convexity as ordinary convexity. If the second
derivative of f exists on a nonempty open interval I, then the nonnegativity of this
derivative is necessary and sufficient for f to be convex on I.


Theorem 4. Let n > 1 be a positive integer and f a real-valued function on a real
interval I. Fix $i \in \{1,2,\ldots,n\}$ and set $\omega = \{i\}$. Then $f(A)[\omega] \ge f(A[\omega])$ for every
n-by-n Hermitian matrix A whose eigenvalues lie in I if and only if f is convex.


Proof: Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be in I and let $u_1, u_2, \ldots, u_n$ in [0,1] be such that
$u_1 + u_2 + \cdots + u_n = 1$. Construct a unitary matrix U whose ith row is
$[\sqrt{u_1}, \sqrt{u_2}, \ldots, \sqrt{u_n}]$ and consider the Hermitian matrix
$A = U\,\mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)\,U^*$. Now, $f(A)[\omega] = \sum_{j=1}^n f(\lambda_j)u_j$ and
$f(A[\omega]) = f(\sum_{j=1}^n \lambda_j u_j)$. Therefore, the inequality in the theorem implies the
convexity of f on I. Conversely, if f is convex on I and A is Hermitian with all its
eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ in I, then $A = U\,\mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)\,U^*$ for some
unitary $U = [u_{ij}]$. Then $A[\omega] = \sum_{j=1}^n \lambda_j|u_{ij}|^2$ and $f(A)[\omega] = \sum_{j=1}^n f(\lambda_j)|u_{ij}|^2$.
But $\sum_{j=1}^n |u_{ij}|^2 = 1$, so $f(A)[\omega] \ge f(\sum_{j=1}^n \lambda_j|u_{ij}|^2) = f(A[\omega])$. ∎
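
In the singleton case the proof reduces to a weighted Jensen inequality, which can be sketched directly. The function name and sample data below are illustrative assumptions; `weights` plays the role of the squared moduli $|u_{ij}|^2$ of one row of a unitary matrix.

```python
import math


def diagonal_entry_gap(f, eigenvalues, weights):
    """Return f(A)[{i}] - f(A[{i}]) for A = U diag(eigenvalues) U^*.

    `weights` are the numbers |u_ij|^2 from row i of U; they are
    nonnegative and sum to 1.
    """
    f_of_A_entry = sum(w * f(lam) for lam, w in zip(eigenvalues, weights))
    f_of_entry = f(sum(w * lam for lam, w in zip(eigenvalues, weights)))
    return f_of_A_entry - f_of_entry


weights = [0.2, 0.3, 0.5]
gap_exp = diagonal_entry_gap(math.exp, [0.0, 1.0, 2.0], weights)    # convex f
gap_sqrt = diagonal_entry_gap(math.sqrt, [0.0, 1.0, 4.0], weights)  # concave f
```

The gap is nonnegative for the convex function and nonpositive for the concave one, in line with Theorem 4.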


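But is ordinary convexity of f sufficient for inequality (3) to hold for all $\omega$? The
following example shows that this is definitely not the case.

Example 1. Consider [6, p. 568]

$$A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}.$$

Then A is Hermitian (it is not positive semidefinite, since $\det A = -1$). The function
$f(x) = x^4$ is convex on $\mathbb{R}$. With $\omega = \{1, 2\}$,

$$A^4[\omega] - A[\omega]^4 = \begin{pmatrix} 9 & 5 \\ 5 & 3 \end{pmatrix} - \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 8 & 5 \\ 5 & 3 \end{pmatrix},$$

which is not positive semidefinite, since its determinant is $-1$.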
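
The same phenomenon can be reproduced in exact integer arithmetic with a matrix that is positive semidefinite. The matrix below is an illustrative choice made here, not the matrix of [6]; it equals $e_1e_1^T + ee^T$ with $e = (1,1,1)^T$, so it is a sum of two positive semidefinite matrices. The helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def power(X, p):
    """p-th power of a square matrix for an integer p >= 1."""
    result = X
    for _ in range(p - 1):
        result = matmul(result, X)
    return result


def leading_block(X, k):
    """Principal submatrix X[omega] for omega = {1, ..., k}."""
    return [row[:k] for row in X[:k]]


def submatrix_gap(A, k, p):
    """f(A)[omega] - f(A[omega]) for f(x) = x**p and omega = {1, ..., k}."""
    top = leading_block(power(A, p), k)
    sub = power(leading_block(A, k), p)
    return [[a - b for a, b in zip(r, s)] for r, s in zip(top, sub)]


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A = [[2, 1, 1],
     [1, 1, 1],
     [1, 1, 1]]

gap3 = submatrix_gap(A, 2, 3)   # [[7, 6], [6, 5]]
gap4 = submatrix_gap(A, 2, 4)   # [[34, 27], [27, 21]]
```

Both gaps are symmetric 2-by-2 matrices with negative determinant, hence indefinite, so inequality (3) fails for the convex functions $x^3$ and $x^4$ on $[0,\infty)$ even when A is positive semidefinite.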

To give a sufficient condition for inequality (3) to hold, we need to introduce
convex matrix functions [14, pp. 543-544]. If A and B are n-by-n Hermitian
matrices with eigenvalues in an interval I and t is in [0,1], then $(1-t)A + tB$ is
also Hermitian with eigenvalues in the same interval I. A real-valued function f
on I is a convex matrix function of order n on I if $(1-t)f(A) + tf(B) \ge f((1-t)A + tB)$
for all such A and B and all t in [0,1]; the function f is called operator
convex on I if it is a convex matrix function of order n on I for all n. Note that
matrix convexity of order 1 is ordinary convexity. Concavity of matrix functions is
defined by reversing the inequality; f is concave if and only if $-f$ is convex.

Loewner’s student, F. Kraus, initiated the study of convex matrix functions and
proved the following result ([16] or [14, p. 547]). If f is twice continuously
differentiable on I, then f is operator convex on I if and only if the matrix
$[\Delta^2 f(t_i, t_j, t_1)]$ is positive semidefinite for all $t_1, t_2, \ldots, t_n$ in I. Examples of
operator convex matrix functions are $x^2$, $-x^{1/2}$, $x^{-1/2}$ and $x^{-1}$; $e^x$ is matrix convex
only of order 1 [14, Section 6.6, Problem 31, p. 555].

Matrix convexity and monotonicity are related by a theorem of Bendat and
Sherman ([3, Theorem 3.3] or [14, p. 547]): A twice continuously differentiable
function f on I is operator convex on I if and only if $\Delta f(t, t_0)$ is operator
monotone on I for all $t_0$ in I. This criterion can be used to prove that the
functions $x^\alpha$ for $1 \le \alpha \le 2$ or for $-1 \le \alpha \le 0$ are operator convex on $[0,\infty)$ or
$(0,\infty)$, respectively. Indeed, the proofs are similar to the one showing the functions
$-x^\alpha$ for $0 \le \alpha \le 1$ have the same property (see problem 17 of Section 6.6 of [14]
or Corollary 4.1 of [1]).

Some relatively recent results relate operator monotonicity, convexity, and
concavity in an interesting way. Using Loewner’s integral representation [14, p.
542] of operator monotone functions, Ando [1, Theorem 4] proved that an
operator monotone function on $(0,\infty)$ is also operator concave there; see [11,
Theorem 2] for a different approach to this result. Finally, Mathias [18, Theorem
2.1] generalized this result by noticing that the proof in [11] shows that any matrix
monotone function of order n on $(0,\infty)$ is also matrix concave of order $\lfloor n/2 \rfloor$ there.

Convex matrix functions of order n and inequality (3) are linked in a result of 
Chandler Davis [8, Theorem 5]. 


Theorem 5 (Davis). If f is a convex matrix function of order n on a given interval I,
then $f(A)[\omega] \ge f(A[\omega])$ for all $\omega \subset \{1,2,\ldots,n\}$ and all n-by-n Hermitian matrices A
with eigenvalues in I.


Davis actually showed that the hypotheses of this theorem imply
$Pf(A)P \ge Pf(PAP)P$ for any n-by-n Hermitian projection P and any n-by-n Hermitian
matrix A with eigenvalues in I. Davis called this the Sherman condition on f of
order n. Although this condition seems to be stronger than that of Theorem 5, it is
not.

Lemma 1. Let A be a given n-by-n Hermitian matrix with eigenvalues in an interval I.
Then $f(A)[\omega] \ge f(A[\omega])$ for all $\omega \subset \{1,2,\ldots,n\}$ if and only if $Pf(A)P \ge Pf(PAP)P$
for any n-by-n Hermitian projection P.


Proof: If $\omega = \{i_1, i_2, \ldots, i_k\}$ and P is the n-by-n matrix whose $i_j$th column has a 1 in
row $i_j$ for $j = 1,2,\ldots,k$ and all other entries are zero, then P is a Hermitian
projection and PAP has $A[\omega]$ in the intersection of the rows and columns
numbered by $\omega$ and zeros everywhere else. Therefore, the Sherman condition
implies the inequality $f(A)[\omega] \ge f(A[\omega])$ for all $\omega \subset \{1,2,\ldots,n\}$. Now assume that
this inequality holds for all $\omega$ and recall that $V^*f(A)V = f(V^*AV)$ for all unitary
matrices V. Let P be any Hermitian projection and V a unitary matrix such that
$V^*PV = L = I_k \oplus 0_{n-k}$, the direct sum of the k-by-k identity matrix and the
(n − k)-by-(n − k) zero matrix. Let $\omega = \{1,2,\ldots,k\}$. Then

$$\begin{aligned}
Pf(A)P &= VLV^*f(A)VLV^* = VLf(V^*AV)LV^* \\
&= V\big((f(V^*AV)[\omega]) \oplus 0_{n-k}\big)V^* \\
&\ge V\big(f((V^*AV)[\omega]) \oplus 0_{n-k}\big)V^* = VLf(LV^*AVL)LV^* \\
&= VLf(V^*PAPV)LV^* = VLV^*f(PAP)VLV^* = Pf(PAP)P. \qquad ∎
\end{aligned}$$


Using Theorem 5 and operator convexity of the functions $f(x) = x^\alpha$, we can
obtain two families of inequalities for principal submatrices that include some
already considered.


Theorem 6. The inequality $A^\alpha[\omega] \ge A[\omega]^\alpha$ holds for all $\omega$ if either (a) $-1 \le \alpha \le 0$
and A is positive definite or (b) $1 \le \alpha \le 2$ and A is positive semidefinite. The
inequality $A[\omega]^\alpha \ge A^\alpha[\omega]$ holds for $0 \le \alpha \le 1$ if A is positive semidefinite.


Part (a) is implied by the second matrix inequality by taking inverses and using
inequality (1). Also, the second inequality for $\tfrac12 \le \alpha \le 1$ can be obtained from part
(b) by using monotonicity and Theorem 2.

By Theorem 3, the Sherman condition of order n is satisfied by the function $x^3$
on the set of positive semidefinite Hermitian matrices of rank one, but this
function is not a convex matrix function for any $n \ge 2$ (see problem 30 of Section
6.6 of [14]). For n > 1, is there a matrix function that satisfies the Sherman
condition of order n but is not matrix convex of this order? This is part of a
question posed by Davis [8, p. 197]. His question is also related to the following
theorem from [8]:


Theorem 7 (Davis). If f satisfies the Sherman condition of order 2n, then f is a 
convex matrix function of order n. 


For n > 1, it is not known if there is a function that is matrix convex of order n 
but does not satisfy the Sherman condition of order 2n. Similarly, necessary and 
sufficient conditions on a matrix function for inequality (3) to hold for a fixed n 
and all w have not yet been found. 

If matrix convexity of a fixed order is strengthened to operator convexity, then 
Davis [9] obtained the following theorem. 


Theorem 8 (Davis). A matrix function f is operator convex on an interval I if and only
if f satisfies the Sherman condition of order n for all n.


4. NORMALIZED POSITIVE LINEAR MAPS. Consider the map
$\Phi_\omega : M_n(\mathbb{C}) \to M_k(\mathbb{C})$ defined by $\Phi_\omega(A) = A[\omega]$. This map has three obvious
properties: (1) it is linear; (2) if A is positive semidefinite Hermitian, then so is
$\Phi_\omega(A)$; and (3) $\Phi_\omega(I_n) = I_k$, where $I_n$ and $I_k$ are the n-by-n and k-by-k identities.
Any map $\Phi : M_n(\mathbb{C}) \to M_k(\mathbb{C})$ with these three properties is called a normalized
positive linear map. Linearity and positivity of $\Phi$ imply that if $A \ge B$ then
$\Phi(A) \ge \Phi(B)$ and that $\Phi$ maps Hermitian matrices to Hermitian matrices (write a
Hermitian A as the difference of two positive semidefinite matrices). Moreover, if
the eigenvalues of the Hermitian matrix A are in (a, b), then so are the eigenvalues
of $\Phi(A)$. Indeed, if $\lambda > a$ is the smallest eigenvalue of A, then $A - \lambda I_n \ge 0$ and
$\Phi(A - \lambda I_n) = \Phi(A) - \lambda I_k \ge 0$, or $\Phi(A) \ge \lambda I_k$; therefore, the eigenvalues of $\Phi(A)$
are greater than a. A similar calculation with $\gamma I_n - A \ge 0$, where $\gamma < b$ is the
largest eigenvalue of A, shows that the eigenvalues of $\Phi(A)$ are smaller than b. In
particular, $\Phi$ maps positive definite matrices to positive definite matrices.

If we replace the extraction of principal submatrices by a normalized positive
linear map in (3), we obtain the inequality

$$\Phi(f(A)) \ge f(\Phi(A)). \tag{4}$$

For what primary matrix functions f and for what matrices A does (4) hold for all
$\Phi$? In the setting of C*-algebras, Kadison [15] obtained a “Generalized Schwarz
Inequality”, which shows that (4) holds for $f(x) = x^2$, Hermitian A, and linear
maps $\Phi$ that preserve the Loewner order and are contractions. Choi [6, Theorem
2.1] showed that (4) holds if $\Phi$ is a normalized positive linear map, f is operator
convex on a symmetric interval $I = (-a, a)$, and A is Hermitian with all eigenvalues
in I. It follows easily that the same is true for any interval $I = (a, b)$. This then
gives a nice version of Theorem 5 for normalized positive linear maps.


Theorem 9. If f is an operator convex function on a given interval I and if
$\Phi : M_n(\mathbb{C}) \to M_k(\mathbb{C})$ is a normalized positive linear map, then $\Phi(f(A)) \ge f(\Phi(A))$
for all n-by-n Hermitian A with eigenvalues in I.


The next theorem gives two important special cases of Choi’s results; the proof 
is due to Ando [2]. 


Theorem 10. Let $\Phi : M_n(\mathbb{C}) \to M_k(\mathbb{C})$ be any normalized positive linear map and let
A in $M_n(\mathbb{C})$ be Hermitian. Then

$$\Phi(A^2) \ge \Phi(A)^2 \tag{5}$$

and

$$\Phi(A^{-1}) \ge \Phi(A)^{-1} \tag{6}$$

if A is positive definite.


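Proof: We begin with the proof of (6). If C and F are positive definite, it is
well-known that

$$\begin{pmatrix} C & D \\ D^* & F \end{pmatrix} \ge 0 \quad\text{if and only if}\quad F \ge D^*C^{-1}D \tag{7}$$

([1, Lemma 1] or [13, Theorem 7.7.6]). If $B > 0$ and $\lambda$ in $(0,\infty)$ are given, it follows
that

$$\begin{pmatrix} \lambda B & B \\ B & \lambda^{-1}B \end{pmatrix} \ge 0 \tag{8}$$

because $\lambda^{-1}B \ge B^*(\lambda B)^{-1}B = \lambda^{-1}B$. By continuity, (8) also holds if $B \ge 0$. Now,
let the positive definite matrix A have eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, with
corresponding orthonormal eigenvectors $x_1, x_2, \ldots, x_n$. Then $A = \sum_{i=1}^n \lambda_i x_i x_i^*$ and
$A^{-1} = \sum_{i=1}^n \lambda_i^{-1} x_i x_i^*$ by the Spectral Theorem; each $\lambda_i > 0$, all $x_i x_i^* \ge 0$, and
$\sum_{i=1}^n x_i x_i^* = I_n$. Substituting $\lambda_i$ for $\lambda$ and the positive semidefinite matrix
$\Phi(x_i x_i^*)$ for B in (8), and summing over i, gives the positive semidefinite Hermitian
matrix

$$\sum_{i=1}^n \begin{pmatrix} \lambda_i\Phi(x_ix_i^*) & \Phi(x_ix_i^*) \\ \Phi(x_ix_i^*) & \lambda_i^{-1}\Phi(x_ix_i^*) \end{pmatrix}
= \begin{pmatrix} \Phi(\sum_i \lambda_i x_ix_i^*) & \Phi(\sum_i x_ix_i^*) \\ \Phi(\sum_i x_ix_i^*) & \Phi(\sum_i \lambda_i^{-1}x_ix_i^*) \end{pmatrix}
= \begin{pmatrix} \Phi(A) & \Phi(I_n) \\ \Phi(I_n) & \Phi(A^{-1}) \end{pmatrix}
= \begin{pmatrix} \Phi(A) & I_k \\ I_k & \Phi(A^{-1}) \end{pmatrix}.$$

Therefore, (7) ensures that $\Phi(A^{-1}) \ge \Phi(A)^{-1}$.

Kadison’s inequality (5) for positive definite A can be proved in the same way
by replacing (8) with

$$\begin{pmatrix} B & \lambda B \\ \lambda B & \lambda^2 B \end{pmatrix} \ge 0$$

in the preceding argument. The result is that

$$\sum_{i=1}^n \begin{pmatrix} \Phi(x_ix_i^*) & \lambda_i\Phi(x_ix_i^*) \\ \lambda_i\Phi(x_ix_i^*) & \lambda_i^2\Phi(x_ix_i^*) \end{pmatrix}
= \begin{pmatrix} I_k & \Phi(A) \\ \Phi(A) & \Phi(A^2) \end{pmatrix} \ge 0$$

and hence that $\Phi(A^2) \ge \Phi(A)^* I_k^{-1} \Phi(A) = \Phi(A)^2$, which is (5) under the
additional assumption that A is positive definite. If A is only Hermitian, choose
$\lambda > 0$ such that $A + \lambda I_n$ is positive definite. The computation
$\Phi(A^2) + 2\lambda\Phi(A) + \lambda^2 I_k = \Phi((A + \lambda I_n)^2) \ge \Phi(A + \lambda I_n)^2 = \Phi(A)^2 + 2\lambda\Phi(A) + \lambda^2 I_k$
gives (5). ∎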


As a consequence of his result that an operator monotone function f on $(0,\infty)$ is
operator concave there, Ando [1, Theorem 4] observed that $f(\Phi(A)) \ge \Phi(f(A))$
for all positive definite A and all normalized positive linear maps. Ando’s result
gives (6) for $f(x) = -x^{-1}$.

Theorem 9 can be used to obtain inequalities for scalars and Hermitian 
matrices. 


Example 2. For an n-by-n matrix A, let $\Phi(A) = \frac{1}{n}\operatorname{tr}(A)$, where tr(A) is the trace
of A. If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are given nonnegative numbers, form
$A = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ and consider the inequality $\Phi(f(A)) \ge f(\Phi(A))$. For
$f(x) = x^\alpha$, this is $\lambda_1^\alpha + \lambda_2^\alpha + \cdots + \lambda_n^\alpha \ge n^{1-\alpha}(\lambda_1 + \lambda_2 + \cdots + \lambda_n)^\alpha$, which is valid for
$1 \le \alpha \le 2$ if the eigenvalues are nonnegative and for $-1 \le \alpha \le 0$ if they are
positive. Ando’s observation shows that the inequality is reversed for nonnegative
eigenvalues and $0 \le \alpha \le 1$.

Example 3. Let $\Phi : M_{2n}(\mathbb{C}) \to M_n(\mathbb{C})$ be defined [1, p. 232] as

$$\Phi\begin{pmatrix} P & Q \\ R & S \end{pmatrix} = \tfrac12(P + S),$$

where P, Q, R, and S are in $M_n(\mathbb{C})$. The function $f(x) = \log x$ is
operator monotone on $(0,\infty)$ and by Ando’s result is also operator concave there. If
$C = A \oplus B$, where A and B are n-by-n positive definite Hermitian matrices, then
$\log((A + B)/2) = \log\Phi(C) \ge \Phi(\log C) = (\log A + \log B)/2$. This inequality can
be rewritten as $\log((A + B)/2) \ge \log A^{1/2} + \log B^{1/2}$. The same $\Phi$ and the matrix
A = diag(1, −.5) can be used to show that inequality (6) does not hold for
arbitrary nonsingular Hermitian matrices.
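
Both examples reduce to scalar computations that are simple to sketch. The function names and sample eigenvalues below are illustrative assumptions; `phi` is the map of Example 3 specialized to n = 1, so it sends a 2-by-2 matrix to half its trace.

```python
def power_sum_gap(lams, alpha):
    """Left side minus right side of the scalar inequality in Example 2."""
    n = len(lams)
    return sum(lam ** alpha for lam in lams) - n ** (1 - alpha) * sum(lams) ** alpha


def phi(M):
    """Example 3's map for n = 1: Phi([[p, q], [r, s]]) = (p + s) / 2."""
    return 0.5 * (M[0][0] + M[1][1])


lams = [1.0, 2.0, 3.0]
gap_convex = power_sum_gap(lams, 1.5)     # alpha in [1, 2]: gap >= 0
gap_inverse = power_sum_gap(lams, -1.0)   # alpha in [-1, 0]: gap >= 0
gap_concave = power_sum_gap(lams, 0.5)    # alpha in [0, 1]: gap <= 0

# A = diag(1, -0.5) is nonsingular Hermitian but not positive definite.
A = [[1.0, 0.0], [0.0, -0.5]]
A_inv = [[1.0, 0.0], [0.0, -2.0]]
lhs_6 = phi(A_inv)       # Phi(A^{-1}) = -0.5
rhs_6 = 1.0 / phi(A)     # Phi(A)^{-1} = 4.0
```

Since `lhs_6` is smaller than `rhs_6`, inequality (6) fails for this indefinite A, as stated at the end of Example 3.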


A note on further reading. A complete up-to-date treatment of matrix functions 
can be found in Chapter 6 of [14]. A more elementary presentation with many 
examples is in [4]. For many years, a useful reference has been [12]. A historical 
exposition of the various definitions of the concept of a matrix function is [19]. 
Loewner’s theory of monotone matrix functions is treated in detail in [10]. Other 
examples of matrix inequalities obtained from $\Phi(f(A)) \ge f(\Phi(A))$ are in [1].


Two Applications of Calculus
to Triangular Billiards 


Eugene Gutkin 


1. INTRODUCTION. The pedal triangle, $T_p$, of a given triangle T is formed by the
feet of the three altitudes of T. Let $\mathcal{T}$ be the space of all triangles. Then
$p : T \mapsto T_p$ is a natural self-mapping of $\mathcal{T}$, the pedal mapping. The dynamics of
iterations of p was investigated from several points of view in [10], [12], [16]. Here
we will study the correspondence $p : T \mapsto T_p$ from yet another, and completely
different, perspective.

We assume that the triangle T is acute, and view it as a billiard table. The pedal
triangle $T_p$, which is inscribed in T, is then a periodic billiard orbit (see §2).
Moreover, $T_p$ is the shortest such orbit. Even more remarkable, for the general
triangular table, the pedal triangle is the only closed (prime) billiard orbit known!

The first proof, by calculus, that among all inscribed triangles the pedal triangle
has the least perimeter, is attributed to J. F. F. Fagnano, ca. 1775. In his honor, the
problem just stated is often called the Fagnano problem [4], [5], [15]. Elementary
geometric solutions were later given independently by H. A. Schwarz and L. Fejer
[14]. Schwarz and Fejer did their work at the end of the 19th century and in the
beginning of the current one. Thus, along with Fagnano, they are the primeval
researchers in polygonal billiards! Following tradition, we will call $T_p$ the Fagnano
orbit. We reserve the name Fagnano geodesic for the Fagnano orbit, traced twice.
Indeed, this is a closed geodesic on the degenerate closed Riemannian surface,
$T_{double}$, which is the two-sided T. Every time a geodesic on $T_{double}$ passes by an
edge, it changes sides. Thus, if we are tracing $T_p$ on $T_{double}$, we have to ‘run along it
twice’ to come back to the starting point.

Let $\mathcal{T}$ be the space of triangles in the euclidean plane, endowed with the
natural topology. Two triangles, T and T′, are close in this topology if we can label
their vertices, T = ABC, T′ = A′B′C′, so that A is close to A′, etc. We denote by
$\mathcal{A} \subset \mathcal{T}$ the subspace of acute triangles, and think of the elements of $\mathcal{A}$ as triangular
billiard tables. The relative length of the Fagnano orbit, $f(T) = |T_p|/|T|$, is a
positive continuous function on $\mathcal{A}$. What is the maximum of f on $\mathcal{A}$, and on which
$T \in \mathcal{A}$ is this maximum attained? What is the mean value of f with respect to the
natural measure on $\mathcal{A}$? In what follows we answer these questions, using
elementary calculus.


2. PERIODIC BILLIARD ORBITS. Let $T \subset \mathbb{R}^2$ be a bounded connected region
with a piecewise $C^1$ boundary. The billiard in T is modeled on the motion without
friction of a perfectly elastic point-mass (see Fig. 1). For emphasis, we call T the
billiard table. Equivalently, the billiard in T corresponds to the propagation of light
rays inside T, where the boundary of T is a perfect mirror.

Periodic billiard orbits in T correspond to inscribed polygons satisfying the
“angle of incidence is equal to the angle of reflection” condition. These are the
harmonic polygons in the terminology of G. D. Birkhoff [2], or the light polygons,
according to M. Berger [1].


Figure 1. Billiard dynamics in a smooth, convex billiard table.


Let |P| denote the perimeter of a polygon. Then harmonic polygons are the
critical points of the function |P| on the space of polygons inscribed in T. This fact
is crucial for the theorem about the existence of periodic orbits in any convex $C^1$
billiard table. Let p, q be positive integers. A periodic orbit with q sides that ‘goes
p times around the table’ has rotation number p/q, and period q. The theorem
asserts that for any rational number, 0 < p/q < 1, with p and q relatively prime,
there are at least two distinct periodic orbits of period q, with rotation number
p/q. These are the so-called Birkhoff periodic orbits (see [15], [9]). This theorem
extends, almost verbatim, to a large class of dynamical systems: area preserving
twist maps [9, chs. 9-13].

If we remove the assumption that the table is $C^1$, then the preceding assertion
breaks down. For instance, no triangular table has a periodic orbit of period 2,
with rotation number 1/2. Obtuse triangles have no periodic orbits of period 3,
with rotation number 1/3.

Little is known about periodic billiard orbits in arbitrary polygons (see [6], [7], 
[8]). It is not even known if every obtuse triangle has a periodic orbit, although 
there are many partial results in this direction [4]. 


3. THE FAGNANO PERIODIC ORBIT. In view of the preceding discussion, it is
remarkable that the pedal triangle of any acute triangle is a canonical periodic
billiard orbit, the Fagnano orbit. Let $T = \triangle ABC$ be an acute triangle with angles
A, B, C and sides a, b, c (Fig. 2). Let $A_1, B_1, C_1$ be the feet of the three altitudes
of ABC. The similarity of the four triangles $\triangle A_1C_1B$, $\triangle A_1B_1C$, $\triangle B_1C_1A$, and T
yields: angle $A_1C_1B = C$, angle $C_1A_1B = A$, etc. (See Fig. 2.) Thus $T_p = \triangle A_1B_1C_1$
is a harmonic triangle.


Figure 2. Pedal triangle, or the Fagnano periodic orbit.


Since the pedal triangle $T_p$ depends continuously on $T \in \mathcal{T}$ in our topology, the
length ratio, $|T_p|/|T|$, is a continuous function on $\mathcal{T}$. But the pedal triangle has a
billiard interpretation only for acute triangles T. A closely related fact (again,
assuming T is acute): $T_p$ is the unique least-perimeter triangle inscribed in
T [1, ch. 9.4]. In what follows, we treat $T \in \mathcal{A}$ as an independent variable, and call
$f(T) = |T_p|/|T|$ the relative length of the Fagnano orbit. We regard f as a function
on $\mathcal{A}$, thought of as the space of triangular billiard tables.


Theorem 1. The maximal relative length of the Fagnano periodic orbit is 1/2. It is 
attained at the equilateral triangles. 


Proof: Let r be the radius of the circle circumscribed about $\triangle ABC$. Then [1,
vol. 1]

$$a = 2r\sin A, \quad b = 2r\sin B, \quad c = 2r\sin C. \tag{1}$$

Similarity of the triangles $A_1C_1B$, $A_1B_1C$, $B_1C_1A$, and ABC yields

$$|A_1B_1| = c\cos C, \quad |B_1C_1| = a\cos A, \quad |A_1C_1| = b\cos B, \tag{2}$$

and hence

$$\frac{|T_p|}{|T|} = \frac{\sin 2A + \sin 2B + \sin 2C}{2(\sin A + \sin B + \sin C)}. \tag{3}$$

From $A + B + C = \pi$, using elementary trigonometry, we obtain

$$\sin 2A + \sin 2B + \sin 2C = 4\sin A\sin B\sin C,$$
$$\sin A + \sin B + \sin C = 4\cos\frac{A}{2}\cos\frac{B}{2}\cos\frac{C}{2}.$$

Thus

$$\frac{|T_p|}{|T|} = 4\sin\frac{A}{2}\sin\frac{B}{2}\sin\frac{C}{2}. \tag{4}$$
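
Formula (4) can be compared against a direct coordinate computation of the pedal triangle. The following is a minimal sketch; the function names and the sample triangle are assumptions made for illustration, and the check is only meaningful for acute triangles.

```python
import math


def foot(p, q, r):
    """Foot of the perpendicular dropped from p onto the line through q and r."""
    dx, dy = r[0] - q[0], r[1] - q[1]
    t = ((p[0] - q[0]) * dx + (p[1] - q[1]) * dy) / (dx * dx + dy * dy)
    return (q[0] + t * dx, q[1] + t * dy)


def perimeter(pts):
    """Perimeter of the closed polygon with the given vertices."""
    return sum(math.dist(pts[i], pts[(i + 1) % len(pts)]) for i in range(len(pts)))


def angle(p, q, r):
    """Interior angle at vertex p of the triangle pqr."""
    v = (q[0] - p[0], q[1] - p[1])
    w = (r[0] - p[0], r[1] - p[1])
    return math.acos((v[0] * w[0] + v[1] * w[1]) / (math.hypot(*v) * math.hypot(*w)))


def relative_length(a, b, c):
    """|T_p| / |T| computed from the feet of the three altitudes."""
    pedal = [foot(a, b, c), foot(b, c, a), foot(c, a, b)]
    return perimeter(pedal) / perimeter([a, b, c])


def formula_4(a, b, c):
    """Right-hand side of (4): 4 sin(A/2) sin(B/2) sin(C/2)."""
    A, B, C = angle(a, b, c), angle(b, c, a), angle(c, a, b)
    return 4.0 * math.sin(A / 2) * math.sin(B / 2) * math.sin(C / 2)


acute = ((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))                   # sample acute triangle
equilateral = ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2))
```

For the sample acute triangle the two computations agree to rounding error, and for the equilateral triangle the pedal triangle is the medial triangle, so the ratio is 1/2.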


Let G be the group of conformal transformations of the plane, i.e., G is
generated by isometries and homotheties. The group G acts naturally on $\mathcal{T}$; let
$\mathcal{T}/G$ denote the quotient space. Using the angles of a triangle as coordinates on
$\mathcal{T}/G$, we identify the closure of $\mathcal{T}/G$ with the simplex $\{(A, B, C) : 0 \le A, B, C;\ A + B + C = \pi\}$
in $\mathbb{R}^3$. The projection of $\mathbb{R}^3$ onto the (A, B)-plane represents
$\mathcal{T}/G$ as the triangle $\{(A, B) : 0 < A, B;\ A + B < \pi\}$. The subspace $\mathcal{A}/G \subset \mathcal{T}/G$
corresponds in this representation to the triangle
$\{(A, B) : 0 < A, B < \pi/2;\ A + B > \pi/2\}$; see Fig. 3.


Figure 3. The space of triangles. 




Since the billiard dynamics depends only on the intrinsic properties of the table,
$\mathcal{T}/G$ is the appropriate space of triangular billiard tables. Setting x = A/2,
y = B/2, $|T_p|/|T| = f(x, y)$ and using (4), we come to the problem of maximizing the
function $f(x, y) = 4\sin x\sin y\cos(x + y)$ on the closed domain
$D = \{(x, y) : 0 \le x, y \le \pi/4 \le x + y\}$. By our preceding remarks, D is the closure of
$\mathcal{A}/G$, and we think of elements of D as acute triangular billiard tables.

By elementary trigonometry

$$f_x = 4\sin y\cos(2x + y), \qquad f_y = 4\sin x\cos(x + 2y).$$

Hence the only critical point of f in D is the point $z_0 = (\pi/6, \pi/6) \in \mathrm{int}(D)$. A
computation shows that the Hessian of f at $z_0$ is negative definite, so $z_0$ is a local
maximum and the corresponding critical value is $f(z_0) = 1/2$.

The point $z_0$ corresponds to the equilateral triangle $A = B = C = \pi/3$. In
order to conclude that the equilateral triangle has the relatively longest Fagnano
orbit, we compute the maximal value of $|T_p|/|T|$ on the boundary, $\partial D$, which
consists of right triangles ABC. By symmetry, it suffices to maximize $|T_p|/|T|$ on
the boundary component given by $C = \pi/2$, $0 \le A \le \pi/4$, $B = \pi/2 - A$. This
corresponds to the interval $0 \le x \le \pi/8$, $y = \pi/4 - x$ in our parametrization.
Restricting f(x, y) to this interval, we obtain $\phi(x) = \sqrt2\,[\cos(\pi/4 - 2x) - 1/\sqrt2]$.
This function is monotonically increasing on $[0, \pi/8]$, from $\phi(0) = 0$ to
$\phi(\pi/8) = \sqrt2 - 1$. Thus, the $(\pi/4, \pi/4, \pi/2)$-triangle maximizes $|T_p|/|T|$ on $\partial D$,
and the corresponding maximal value is $\sqrt2 - 1 < 1/2$. ∎
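
The maximization can be cross-checked by brute force on a grid over the domain $\{(x, y) : 0 \le x, y \le \pi/4 \le x + y\}$. This is only a numerical sketch under assumed names and an assumed grid size, not a replacement for the calculus argument; the grid size is taken divisible by 3 so that $(\pi/6, \pi/6)$ is a grid point.

```python
import math


def f(x, y):
    """Relative length of the Fagnano orbit in the coordinates x = A/2, y = B/2."""
    return 4.0 * math.sin(x) * math.sin(y) * math.cos(x + y)


def grid_max(n=300):
    """Largest value of f on an (n+1)-by-(n+1) grid restricted to the domain.

    Grid points are (i*h, j*h) with h = (pi/4)/n; the constraint
    x + y >= pi/4 becomes i + j >= n.
    """
    h = (math.pi / 4) / n
    best = (-1.0, 0.0, 0.0)
    for i in range(n + 1):
        for j in range(n - i, n + 1):
            x, y = i * h, j * h
            best = max(best, (f(x, y), x, y))
    return best


def phi(x):
    """Restriction of f to the boundary edge y = pi/4 - x (right triangles)."""
    return f(x, math.pi / 4 - x)


value, x_star, y_star = grid_max()
boundary_peak = phi(math.pi / 8)
```

The grid maximum sits at $x = y = \pi/6$ with value 1/2, and the boundary value at $x = \pi/8$ equals $\sqrt2 - 1$, in agreement with Theorem 1.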


The space $\mathcal{T}$ is, essentially, $\mathbb{R}^6$. The natural measure on $\mathcal{T}$ corresponds to the
Lebesgue measure on $\mathbb{R}^6$. Under the projection $\mathcal{T} \to \mathcal{T}/G$ it reduces to the
Lebesgue measure $c\,dx\,dy$ on the square $\{(x, y) : 0 \le x, y \le \pi/4\}$. The value of
the constant c > 0 is irrelevant for our purpose, and we set c = 1.


Theorem 2. The average relative length of the Fagnano periodic orbit, with respect to
the Lebesgue measure on the space of acute triangles, is $12\pi^{-1} - 24\pi^{-2} - 1 \approx .39$.


Proof: We use the notation and the setting of the proof of Theorem 1. By our
preceding remarks, the quantity we are after is the mean value, with respect to the
Lebesgue measure, of $f(x, y) = 4\sin x\sin y\cos(x + y)$ on the region
$D = \{(x, y) : 0 \le x, y \le \pi/4 \le x + y\}$.

Set I = {{,f(x, y)dedy. By elementary trigonometry, f(x, y) = cos2x + 
cos2y — cos2(x + y) — 1. Using the symmetry of J, we write 


[= 2J {cos 2xdxdy — J [es 2(x + y) dxdy — area(Q). 


The integrals are straightforward to evaluate, and we omit the details. We have 


| [ cos2xdedy = ~ | [cos 2x + y) dxdy = “(= — i) 
Thus 


[= “(3 — 7 — area(D). 


Since @ is the right isoceles triangle with side 7/4, we have area(D) = 17/32, 
and the mean value is: 


12(7 — 2 
(fy = ATO |_| 
Recall that f(T) is the relative length of the Fagnano periodic orbit in T. By the 
proof of Theorem 1, the range of f is the interval [0, .5]. Note that the mean value 
of f, which is approximately .39, is much closer to .5 than to zero. 
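
As a sanity check on the closed form in Theorem 2, the mean value can be approximated by a simple midpoint rule over the region {0 < x, y < π/4 < x + y}. This is a rough sketch, not part of the proof: the grid size `n`, the half-weighting of cells cut by the line x + y = π/4, and all function names are assumptions made for the illustration.

```python
import math


def f(x, y):
    """Integrand f(x, y) = 4 sin x sin y cos(x + y)."""
    return 4.0 * math.sin(x) * math.sin(y) * math.cos(x + y)


def mean_over_acute_region(n=200):
    """Midpoint-rule average of f over {0 < x, y < pi/4 < x + y}."""
    h = (math.pi / 4) / n
    total = 0.0
    weight_sum = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            s = i + j + 1  # (x + y) / h at the cell midpoint
            if s < n:
                continue  # cell lies outside the region
            # Cells whose midpoint is on the line x + y = pi/4 are half inside.
            w = 0.5 if s == n else 1.0
            total += w * f(x, (j + 0.5) * h)
            weight_sum += w
    return total / weight_sum


closed_form = 12 / math.pi - 24 / math.pi**2 - 1
numeric_mean = mean_over_acute_region()
print(closed_form, numeric_mean)
```

The two printed numbers should agree to a few decimal places, and both round to .39.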


4. CONCLUDING REMARKS. The equilateral triangle almost invariably pops up 
as the solution to any geometric-analytic optimization problem for triangles [13]. 
Thus, Theorem 1 is far from surprising, and may well be in the literature. We note 
that the equilateral triangle is the unique fixed point of the pedal mapping, 
p: T ↦ T₁, of 𝒯/G. Let T ∈ 𝒯/G be arbitrary, and set T_n = pⁿ(T), the ‘pedal 
triangle of n-th generation’. For almost any T the sequence T = T₀, T₁, T₂, ... is 
infinite, i.e., the triangles T_n never degenerate. The equilateral triangle is 
characterized by the condition that T_n is acute for all n. Both remarks follow from 
the representation of the pedal mapping in [12]. 

Theorems 1 and 2 suggest a possible approach to the elusive problem of 
periodic orbits in polygons [3], [11]. Let τ be a specific type of orbit, and let 
𝒫 = 𝒫(τ) be the space of polygons that have a periodic orbit of type τ. The 
billiard orbit of a given type in a polygonal table, if it exists, is essentially unique. 
Let P_τ ⊂ P be the corresponding inscribed polygon. The ratio |P_τ|/|P| = f_τ(P) is 
the relative length of the orbit of type τ. We view f_τ as a function on 𝒫. It may be 
useful to investigate these functions from the point of view of Theorems 1 and 2. 
Early Transcendentals 


Steven H. Weintraub 


Several current calculus texts have “early transcendental” versions, in which the 
exponential and logarithm functions are introduced early in the text. These 
functions are usually justified by various “hand-waving” arguments. The point of 
this article is to show how they may be introduced rigorously. 

We proceed in two steps. We begin with the basic existence theorem (BET): 
There is a differentiable function y = f(x), defined for x ≥ 0, satisfying 


f(0) = 1,   f′(x) = f(x)   (*) 


for all x ≥ 0. From the BET we derive that there is a unique function f(x), 
defined and differentiable for all real x, that satisfies (*) for all real x. This 
function is of course the exponential function. We then derive the basic properties 
of the exponential function. These derivations provide a beautiful illustration of 
some of the basic elements of calculus: the greatest lower bound property, the 
intermediate value theorem, the mean value theorem, and the chain rule. 

We then prove the BET by using power series. At first glance this may seem to 
delay the introduction of transcendental functions even more than the usual 
approach, as power series are introduced rather late in most calculus courses 
(though, logically speaking, it is not necessary to do so). However, (*) is carefully 
chosen so that we may prove it with only a minimum of the theory of power series 
—we need consider only series with all terms non-negative, and the only conver- 
gence criterion we use is comparison with a geometric series. Thus this material 
may be introduced separately, near the beginning of the course, and returned to 
later, when power series are considered in detail. This approach has another 
advantage. Introducing the power series for the exponential function early not only 
introduces the exponential function early, but also provides a method for actually 
computing it. 


1. THE BASIC EXISTENCE THEOREM AND ITS CONSEQUENCES 


Theorem 1.1. (The basic existence theorem = BET) There is a differentiable function 
y = f(x), defined for x ≥ 0, satisfying 
f(0) = 1,   f′(x) = f(x)   (*) 
for all x ≥ 0. 

For the remainder of this section we assume the BET. We prove it in the next 
section, as Corollary 2.16. 

Note that the BET does not tell us that there is such a function f(x), much less 
a unique such function, defined for all real x. Our first objective is to derive this 
from the BET. Actually, we prove a slightly more general result in Theorem 1.4. 


Lemma 1.2. If f(x) is a function satisfying the BET, then f(x) > 0 for all x ≥ 0. 
Proof: Suppose there is an x ≥ 0 with f(x) ≤ 0. Then the set S = 
{x ≥ 0 | f(x) ≤ 0} is non-empty. Since S is bounded from below by 0, S has a 
greatest lower bound x₀ ≥ 0. Then, by the continuity of f, we have f(x₀) ≤ 0. 
Since f(0) = 1, we have x₀ > 0. Then, 1 = f(0) > f(x₀), so, by the mean value 
theorem, there is an x₁ with 0 < x₁ < x₀, and such that f′(x₁) < 0. Thus f(x₁) = 
f′(x₁) < 0, and so x₁ ∈ S. But x₁ < x₀, which contradicts x₀ being a lower bound 
of S. We conclude that f(x) > 0 for all x ≥ 0. ∎


Lemma 1.3. There is a function e(x) that satisfies (*) for all x ∈ R, and furthermore 
has e(x) > 0 for all x ∈ R. 


Proof: Let f(x) be a function satisfying the BET. Define e(x) by 

$$e(x) = \begin{cases} f(x), & x \ge 0, \\ 1/f(-x), & x \le 0. \end{cases}$$

This definition is consistent for x = 0 as f(0) = 1 = 1/f(0). Note that, by Lemma 
1.2, e(x) is defined for all x ∈ R, and is always positive. Clearly e(x) 
is differentiable for x > 0. As a composition of differentiable functions, e(x) is 
differentiable for x < 0. Clearly e′(x) = e(x) for x > 0. For x < 0 we have, by the 
chain rule, 

$$\frac{d}{dx}\bigl(e(x)\bigr) = \frac{d}{dx}\bigl(1/f(-x)\bigr) = \bigl(-f'(-x)\bigr)\bigl(-1/f(-x)^2\bigr) = f'(-x)/f(-x)^2 = f(-x)/f(-x)^2 = 1/f(-x) = e(x).$$

For the same reasons, the derivative from the right at x = 0 exists and has the 
value 1, and the derivative from the left at x = 0 exists and has value 1, so e′(0) 
exists and has value 1. Thus e′(x) exists for all x, e′(0) = 1, and e′(x) = e(x) for all 
x, so e(x) satisfies (*) for all x ∈ R. ∎


Theorem 1.4. Fix real numbers a and b. There exists a unique differentiable function g 
satisfying 

g(0) = b,   g′(x) = ag(x)   (**) 
for all x ∈ R. 


Proof: Let e(x) be any function satisfying Lemma 1.3. Define g(x) by g(x) = 
be(ax). Then g(0) = be(0) = b·1 = b and, by the chain rule, 

$$g'(x) = b\,\bigl(e'(ax)\cdot a\bigr) = a\,\bigl(b\,e'(ax)\bigr) = a\,\bigl(b\,e(ax)\bigr) = a\,g(x),$$

so g(x) satisfies (**). 

Now let g(x) be any function satisfying (**). Let h(x) be the quotient 
h(x) = g(x)/e(ax). (Note the denominator is never zero.) Then h(0) = g(0)/e(0) = 
b/1 = b and, again by the chain rule, 

$$h'(x) = \frac{e(ax)\,\bigl(a\,g(x)\bigr) - \bigl(a\,e(ax)\bigr)\,g(x)}{e(ax)^2} = 0$$

for all x, so h(x) is constant. Hence, h(x) = h(0) = b for all x ∈ R, i.e., g(x) = 
be(ax). Choosing b = 1 and a = 1 shows that g(x) = e(x) in this case, i.e., that 
there is a unique function satisfying the conclusion of Lemma 1.3. Letting b and a 
be arbitrary gives Theorem 1.4 in general. ∎


624 EARLY TRANSCENDENTALS [August-September 


Now that we have unique functions with our desired properties, we can give 
them names. 


Definition 1.5. For a fixed real number a, let exp_a(x) be the unique differentiable 
function satisfying 


exp_a(0) = 1,   exp_a′(x) = a exp_a(x)   for all x ∈ R. 
We abbreviate exp₁(x) by exp(x). 


Corollary 1.6. For any fixed real number a, exp_a(x) = exp(ax). 


Proof: exp(a·0) = 1 = exp_a(0), (exp(ax))′ = a exp(ax), and exp_a′(x) = a exp_a(x), 
so, by the uniqueness part of Theorem 1.4, exp(ax) = exp_a(x) for all x ∈ R. ∎


The next two theorems give the basic properties of the exponential function. 


Theorem 1.7. The function exp(x) has the following properties: 


a) exp(x) > 0 for all x ∈ R. 

b) exp(x) is strictly increasing. 

c) exp(x) ≥ 1 + x for all x ∈ R. 

d) lim_{x→∞} exp(x) = +∞, lim_{x→−∞} exp(x) = 0. 

e) For any n, lim_{x→∞} exp(x)/xⁿ = +∞. 


Proof: 

a) exp(x) = e(x) is positive for all x ∈ R by Lemma 1.3. 

b) exp′(x) = exp(x) > 0 for all x so, by the mean value theorem, exp(x) is 
strictly increasing. 

c) If x = 0 then exp(x) = 1 + x. Suppose x > 0. Then, by the mean value 
theorem, exp(x) − exp(0) = exp′(c)(x − 0) for some c with 0 < c < x. But 
exp′(c) = exp(c) > exp(0) = 1, so exp(x) = exp(0) + exp(c)x > 1 + x for 
x > 0. A similar application of the mean value theorem shows that 
exp(x) > 1 + x for x < 0. 

d) It follows immediately from c) that lim_{x→∞} exp(x) = +∞. By the uniqueness 
of exp(x), and by Lemma 1.3, exp(x) = 1/exp(−x), so lim_{x→−∞} exp(x) = 0. 

e) Let f(x) = exp(x)/x^{n+1}. Then f′(x) = (x^{n+1} exp(x) − exp(x)(n + 1)xⁿ)/x^{2n+2} = 
(x − (n + 1)) exp(x)/x^{n+2} > 0 for x > n + 1. Hence f(x) > f(n + 1) for x > n + 1, i.e., 
exp(x)/x^{n+1} > exp(n + 1)/(n + 1)^{n+1} for x > n + 1, so exp(x)/xⁿ > x·exp(n + 1)/(n + 1)^{n+1} for 
x > n + 1, and so lim_{x→∞} exp(x)/xⁿ = +∞. ∎


Theorem 1.8. For any real number a, and for any real numbers w and x, 
exp_a(w + x) = exp_a(w) exp_a(x). 


Proof: Fix a and w and set h(x) = exp_a(x + w)/exp_a(w). (Note the denominator 
is non-zero.) Then h(0) = 1 and h′(x) = ah(x) for all x. Comparing Definition 1.5, we see 
that h(x) = exp_a(x), i.e., 

$$\exp_a(x) = \frac{\exp_a(x + w)}{\exp_a(w)}. \qquad\blacksquare$$


Corollary 1.9. For any integer n and all x ∈ R, exp_a(nx) = exp_a(x)ⁿ. 
Proof: Induction and Theorem 1.8. ∎


Definition 1.10. The real number e is defined by e = exp(1). (Note e = exp(1) > 
exp(0) = 1.) 


Theorem 1.8 and Corollary 1.9 justify the notation 
eˣ = exp(x) (1.11) 


as e⁰ = 1, e⁻¹ = 1/e, e^{m+n} = e^m·e^n and e^{mn} = (e^m)^n, the usual laws of 
exponents, for any rational numbers m and n. 

The continuity of exp(x), and parts b) and d) of Theorem 1.7, imply that the 
exponential function exp : R → (0, ∞) has a continuous inverse. 


Definition 1.12. The natural logarithm function ln : (0, ∞) → R is the inverse of 
exp : R → (0, ∞). 


From Definition 1.12, Theorem 1.8 and Corollary 1.9 (with a = 1) it is easy to 
derive the usual properties for logarithms: for x, y > 0, ln(xy) = ln(x) + ln(y), 
ln(1/x) = −ln(x), ln(xⁿ) = n ln(x). More interesting are the following properties: 


Proposition 1.13 


a) ln(1) = 0. 
b) ln(x) is differentiable for all x > 0 and ln′(x) = 1/x. 


Proof: Part a) is immediate from 1 = exp(0). As for part b), recall a general result 
about inverse functions: If f is a differentiable function with inverse g, and 
f′(a) ≠ 0, then g is differentiable at f(a) and g′(f(a)) = 1/f′(a). We apply this 
here with f = exp and g = ln. Then f′(a) = f(a) ≠ 0 for any a. Set a = ln(x). 
Then f(a) = x, so g′(x) = g′(f(a)) = 1/f′(a) = 1/f(a) = 1/x. ∎


Remark 1.14. It is common to define ln(x) by 

$$\ln(x) = \int_1^x \frac{1}{t}\,dt. \tag{1.15}$$

If we define ln(x) by Definition 1.12, then (1.15) becomes a theorem that is an 
immediate application of Proposition 1.13 and the fundamental theorem of calculus. 


For any positive number a, we can now define aˣ by 

$$a^x = \exp_{\ln(a)}(x) = \exp(x\ln(a)). \tag{1.16}$$

Again aˣ satisfies the usual laws of exponents (a⁰ = 1, a¹ = a, a^{m+n} = a^m a^n, 
(a^m)^n = a^{mn}). Also, by Definition 1.5, 

$$\frac{d}{dx}\bigl(a^x\bigr) = \frac{d}{dx}\bigl(\exp_{\ln(a)}(x)\bigr) = \ln(a)\,\exp_{\ln(a)}(x) = a^x\ln(a). \tag{1.17}$$


2. PROOF OF THE BASIC EXISTENCE THEOREM. In this section we need to 
use some basic facts about series. However, we always deal with series with 
non-negative terms, and that simplifies the proofs of these facts. For a series with 
non-negative terms has increasing partial sums, and it is easy to show that an 
increasing sequence that is bounded from above converges to its least upper 
bound. Beyond this, all we need is that if Σb_n and Σc_n are series of non-negative 
terms that converge to B and C respectively, then Σ(b_n + c_n) converges to B + C, 
and Σtb_n converges to tB for t ≥ 0. If, in addition, b_n ≥ c_n for all n, we need to 
know that B ≥ C, and also that Σ(b_n − c_n) converges to B − C. 

We now consider series of the form 

$$\sum_{n=0}^{\infty} a_n x^n, \qquad x \in \mathbf{R}.$$

If, for some value of x, this series converges to some value S, we say that the sum 
of the series is S = S(x). In this way we define a function, and we write 

$$S(x) = \sum_{n=0}^{\infty} a_n x^n. \tag{2.1}$$


Definition 2.2. S(x) has non-negative coefficients if a_n ≥ 0 for all n ≥ 0. 


Definition 2.3. Let E(x) be the series 

$$E(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots.$$

We will be interested only in series with non-negative coefficients, and the sums 
of these series for values of x ≥ 0; in particular, we are most interested in the 
series E(x), and its values for x ≥ 0. In this general situation the partial sums of 
S(x) always form an increasing sequence. 


Definition 2.4. A series S(x) with non-negative coefficients is eventually geometrically 
dominated for x = x₀ if there is a real number C, a real number r with 
0 ≤ r < 1, and an integer N such that a_n x₀ⁿ ≤ Crⁿ for all n ≥ N. 

Definition 2.5. A series S(x) is called peaceful if 


a) S(x) has non-negative coefficients, and 
b) S(x) is eventually geometrically dominated for every x ≥ 0. 


Every series S(x) is eventually geometrically dominated for x = 0 (choose 
C =0, r= 0, and N = 1), so the condition in Definition 2.5 b) is non-trivial only 
for x > 0. 


Lemma 2.6. Let S(x) = Σ_{n=0}^∞ a_n xⁿ with a_n ≥ 0 for all n, and suppose that a_n > 0 
for n sufficiently large and that lim_{n→∞} a_{n+1}/a_n = 0. Then S(x) is peaceful. 


Proof: Let x > 0 be arbitrary. Choose N so that a_n > 0 and a_{n+1}/a_n < 1/(x + 1) 
for n ≥ N; this is possible as lim_{n→∞} a_{n+1}/a_n = 0. Let r = x/(x + 1) < 1. For any 
n > N we have 

$$a_n x^n = \Bigl(\frac{a_n}{a_{n-1}}\,x\Bigr)\Bigl(\frac{a_{n-1}}{a_{n-2}}\,x\Bigr)\cdots\Bigl(\frac{a_{N+1}}{a_N}\,x\Bigr)\,a_N x^N < a_N x^N r^{\,n-N} = \bigl\{a_N x^N r^{-N}\bigr\}\,r^n,$$

so S(x) is eventually geometrically dominated. ∎

Corollary 2.7. E(x) is peaceful. 
Proof: For E(x), a_n = 1/n!, so a_{n+1}/a_n = 1/(n + 1) and lim_{n→∞} a_{n+1}/a_n = 0. ∎
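
The constants in the proof of Lemma 2.6 can be made explicit for E(x). The following sketch, using exact rational arithmetic, picks N, r and C as in that proof for a given x > 0 and checks the bound a_n xⁿ ≤ Crⁿ for a finite range of n. The helper names and the cutoff `n_max` are assumptions for the illustration; a finite check is of course not a proof.

```python
from fractions import Fraction
from math import factorial, floor


def domination_constants(x):
    """Constants (N, r, C) from the proof of Lemma 2.6 for a_n = 1/n!, x > 0."""
    x = Fraction(x)
    if x <= 0:
        raise ValueError("this sketch only treats x > 0")
    # a_{n+1}/a_n = 1/(n+1) < 1/(x+1) holds exactly when n > x.
    N = floor(x) + 1
    r = x / (x + 1)
    # C is the expression in braces: a_N x^N r^(-N).
    C = Fraction(1, factorial(N)) * x**N / r**N
    return N, r, C


def is_dominated(x, n_max=60):
    """Check a_n x^n <= C r^n for N <= n <= n_max."""
    x = Fraction(x)
    N, r, C = domination_constants(x)
    return all(
        Fraction(1, factorial(n)) * x**n <= C * r**n for n in range(N, n_max + 1)
    )


print(domination_constants(5))
print(is_dominated(5), is_dominated(Fraction(1, 2)))
```

For x = 5 this gives N = 6 and r = 5/6, and the bound holds with equality at n = N and strictly afterwards, as in the proof.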


Lemma 2.8. Let S(x) = Σ_{n=0}^∞ a_n xⁿ be a peaceful series. Then this series converges for 
every x ≥ 0. 


Proof: Fix x ≥ 0. Since the coefficients a_n are all non-negative, and x ≥ 0, each 
term in the series is non-negative, and so the partial sums form an increasing 
sequence. S(x) is eventually geometrically dominated, so pick C, r, and N as in 
Definition 2.4. Then 

$$\sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{N-1} a_n x^n + \sum_{n=N}^{\infty} a_n x^n.$$

Let us compare this with the series 

$$\sum_{n=0}^{N-1} a_n x^n + \sum_{n=N}^{\infty} C r^n.$$

Consider this second series. Its first summand is a finite sum; denote its value 
by S₀. Its second summand is a geometric series with sum Cr^N/(1 − r); in 
particular, this sum is an upper bound of the partial sums. Each term in the first 
series is less than or equal to the corresponding term in the second series, so the 
partial sums of the first series are bounded from above by S₀ + Cr^N/(1 − r). 
Thus, this series converges. ∎


Lemma 2.9. Let S(x) be peaceful. Then S(x) is an increasing function on [0, ∞). 


Proof: If x₂ > x₁, 

$$S(x_2) - S(x_1) = \sum_{n=0}^{\infty} a_n x_2^n - \sum_{n=0}^{\infty} a_n x_1^n = \sum_{n=0}^{\infty} a_n\,(x_2^n - x_1^n) \ge 0,$$

i.e., S(x₂) ≥ S(x₁). ∎


Lemma 2.10. Let S(x) be peaceful. Then S(x) is continuous for all x ≥ 0. 


Proof: We have to show that for every x₀ ≥ 0, and for every ε > 0, there exists a δ 
such that |S(x) − S(x₀)| < ε whenever |x − x₀| < δ and x ≥ 0. Let x₁ = x₀ + 1. 
Then S(x₁) exists, and is the limit of its partial sums, so there is an N such that for 
any x ≤ x₁, 

$$0 \le \sum_{n=N+1}^{\infty} a_n x^n \le \sum_{n=N+1}^{\infty} a_n x_1^n = S(x_1) - \sum_{n=0}^{N} a_n x_1^n < \varepsilon/2.$$

Now f(x) = Σ_{n=0}^N a_n xⁿ is a polynomial, and hence a continuous function, so there 
exists a δ₁ > 0 with 0 ≤ f(x) − f(x₀) < ε/2 for x₀ ≤ x < x₀ + δ₁. Thus, if we let 
δ₊ = min(1, δ₁), then for x₀ ≤ x < x₀ + δ₊, 

$$\begin{aligned}
0 \le S(x) - S(x_0) &= \sum_{n=0}^{\infty} a_n x^n - \sum_{n=0}^{\infty} a_n x_0^n \\
&\le \sum_{n=0}^{\infty} a_n x^n - \sum_{n=0}^{N} a_n x_0^n = \sum_{n=0}^{N} a_n x^n + \sum_{n=N+1}^{\infty} a_n x^n - \sum_{n=0}^{N} a_n x_0^n \\
&= \sum_{n=N+1}^{\infty} a_n x^n + \Bigl[\sum_{n=0}^{N} a_n x^n - \sum_{n=0}^{N} a_n x_0^n\Bigr] < \varepsilon/2 + \varepsilon/2 = \varepsilon.
\end{aligned}$$

A similar (in fact, easier) argument for x < x₀ (taking x₁ = x₀ there) shows there 
is a δ₋ with −ε < S(x) − S(x₀) ≤ 0 for x₀ − δ₋ < x ≤ x₀; setting δ = 
min(δ₊, δ₋) we see that |S(x) − S(x₀)| < ε for |x − x₀| < δ, x ≥ 0, as required. ∎


Definition 2.11. For a given series S(x) = Σ_{n=0}^∞ a_n xⁿ, let S̃(x) = 
Σ_{n=0}^∞ (n + 1)a_{n+1}xⁿ be the series obtained by differentiating S(x) term-by-term. 


Example 2.12. For the series E(x), we have 

$$\tilde{E}(x) = 1 + \frac{2x}{2!} + \frac{3x^2}{3!} + \frac{4x^3}{4!} + \cdots = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = E(x).$$


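Lemma 2.13. Let S(x) = Σ_{n=0}^∞ a_n xⁿ be peaceful. Then S̃(x) is also peaceful. 


Proof: Write S̃(x) = Σ_{n=0}^∞ b_n xⁿ, so b_n = (n + 1)a_{n+1} ≥ 0 for all n. Pick an x > 0. 
Then, since S(x) is peaceful, there are a real number B, an integer M, and a q < 1 
with a_n xⁿ ≤ Bqⁿ for all n ≥ M. Choose an integer N > max(M, q/(1 − q)). A little algebra 
shows that if N > q/(1 − q), then r = q(N + 1)/N < 1. Then, for n ≥ N, 

$$b_n x^n = (n + 1)\,a_{n+1}\,x^n \le (n + 1)\,B\,q^{n+1}/x.$$

But for n > N, 

$$n + 1 = \Bigl(\frac{n+1}{n}\Bigr)\Bigl(\frac{n}{n-1}\Bigr)\Bigl(\frac{n-1}{n-2}\Bigr)\cdots\Bigl(\frac{N+1}{N}\Bigr)N$$

and, since 

$$\frac{n+1}{n} < \frac{n}{n-1} < \cdots < \frac{N+1}{N},$$

we have 

$$n + 1 < \Bigl(\frac{N+1}{N}\Bigr)^{n-N+1} N = \Bigl(\frac{N+1}{N}\Bigr)^{n}\Bigl(\frac{N+1}{N}\Bigr)^{1-N} N,$$

so 

$$b_n x^n \le \Bigl(\frac{N+1}{N}\Bigr)^{n}\Bigl(\frac{N+1}{N}\Bigr)^{1-N} N\,B\,q^{n+1}/x = \Bigl\{\Bigl(\frac{N+1}{N}\Bigr)^{1-N}\frac{N B q}{x}\Bigr\}\Bigl(q\,\frac{N+1}{N}\Bigr)^{n} = C r^n,$$

where C is the expression in braces. Thus, S̃(x) is eventually geometrically 
dominated for x. ∎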


Corollary 2.14. If S(x) is peaceful, then S̃(x) is continuous for all x ≥ 0. 


Proof: If S(x) is peaceful, then S̃(x) is peaceful by Lemma 2.13, and then S̃(x) is 
continuous by Lemma 2.10. ∎


Theorem 2.15. If S(x) is peaceful, then S(x) is differentiable for all x ≥ 0, and 
S′(x) = S̃(x). 
Proof: We have to show that for every x₀ ≥ 0 and every ε > 0, there exists a δ > 0 
such that 

$$\Bigl|\frac{S(x) - S(x_0)}{x - x_0} - \tilde{S}(x_0)\Bigr| < \varepsilon$$

whenever 0 < |x − x₀| < δ and x ≥ 0. Since S̃(x) is continuous at x₀, there is a 
δ > 0 so that |S̃(x) − S̃(x₀)| < ε whenever |x − x₀| < δ and x ≥ 0. Now 

$$\frac{S(x) - S(x_0)}{x - x_0} = \frac{1}{x - x_0}\Bigl(\sum_{n=0}^{\infty} a_n x^n - \sum_{n=0}^{\infty} a_n x_0^n\Bigr) = \frac{1}{x - x_0}\sum_{n=1}^{\infty} a_n\,(x^n - x_0^n) = \sum_{n=0}^{\infty} a_{n+1}\,\frac{x^{n+1} - x_0^{n+1}}{x - x_0}.$$

We have the identity x^{n+1} − x₀^{n+1} = (x − x₀)(xⁿ + x^{n−1}x₀ + x^{n−2}x₀² + ··· + x₀ⁿ), 
so (x^{n+1} − x₀^{n+1})/(x − x₀) = xⁿ + ··· + x₀ⁿ. (There are n + 1 terms.) 
Suppose that x₀ < x. Then we see that, setting y_n = (x^{n+1} − x₀^{n+1})/(x − x₀), 

$$(n + 1)\,x_0^n \le y_n \le (n + 1)\,x^n,$$

and so 

$$\sum_{n=0}^{\infty} (n + 1)\,a_{n+1}\,x_0^n \le \sum_{n=0}^{\infty} a_{n+1}\,y_n \le \sum_{n=0}^{\infty} (n + 1)\,a_{n+1}\,x^n,$$

i.e., 

$$\tilde{S}(x_0) \le \frac{S(x) - S(x_0)}{x - x_0} \le \tilde{S}(x),$$

and so 

$$0 \le \frac{S(x) - S(x_0)}{x - x_0} - \tilde{S}(x_0) \le \tilde{S}(x) - \tilde{S}(x_0) < \varepsilon$$

(where the last inequality is true by our choice of δ). A similar argument holds if 
x < x₀. ∎


We now arrive at our desired conclusion: 


Corollary 2.16. Let E(x) = 1 + x + (x²/2!) + (x³/3!) + ···. Then E(0) = 1 and 
E′(x) = E(x) for all x ≥ 0, i.e., E(x) is a function satisfying the basic existence 
theorem (BET). 


Proof: We see E(0) = 1 by direct substitution. By Theorem 2.15, E(x) is differentiable 
for all x ≥ 0 and E′(x) = Ẽ(x) = E(x). ∎


Remark 2.17. Combining Definition 1.5 and Corollary 2.16, we see that 

$$\exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \qquad \text{for } x \ge 0$$

and exp(x) = 1/exp(−x) for x < 0. This gives us a method of calculating exp(x). 
For example, taking the sum of the first twelve terms of this series for x = 1 gives 
the approximation 2.71828182 for e. 
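
The computation described in this remark can be sketched in a few lines. The function names and the default number of terms are assumptions for the illustration; the method (the series for x ≥ 0, and the reciprocal for x < 0) is the one stated above.

```python
def exp_partial_sum(x, terms):
    """Sum of the first `terms` terms of 1 + x + x^2/2! + x^3/3! + ... for x >= 0."""
    if x < 0:
        raise ValueError("the series is only used for x >= 0")
    total = 0.0
    term = 1.0  # current term x^n / n!, starting at n = 0
    for n in range(terms):
        total += term
        term *= x / (n + 1)
    return total


def exp_series(x, terms=30):
    """exp(x) from the series for x >= 0 and from 1/exp(-x) for x < 0."""
    if x >= 0:
        return exp_partial_sum(x, terms)
    return 1.0 / exp_partial_sum(-x, terms)


e_twelve_terms = exp_partial_sum(1.0, 12)
print(e_twelve_terms)
```

With twelve terms at x = 1 the partial sum agrees with the approximation 2.71828182 quoted in the remark to the digits shown.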
Remark 2.18. We may generalize the results of this section as follows: Let A > 0 
and say S(x) is A-peaceful if it satisfies Definition 2.5 a) and Definition 2.5 b) 
for 0 ≤ x < A. Then the results of this section all hold for A-peaceful series 
with the restriction that 0 ≤ x < A in their conclusions. In Lemma 2.6, the 
hypothesis becomes that a_n ≥ 0 for all n, a_n > 0 for n sufficiently large, and 
lim_{n→∞} a_{n+1}/a_n = 1/A, and the conclusion becomes that S(x) is A-peaceful. The 
proofs are almost the same. 

In particular, consider the series 

$$L(x) = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots.$$

It is easy to check that L(x) is 1-peaceful. Then L̃(x) = 1 + x + x² + x³ + ···, 
and we recognize this as a geometric series with sum 1/(1 − x) for 0 ≤ x < 1, i.e., 
L̃(x) = 1/(1 − x) for 0 ≤ x < 1. Thus we see that L′(x) = L̃(x) = 1/(1 − x) for 
0 ≤ x < 1. On the other hand, if f(x) = −ln(1 − x), then, by Proposition 1.13 and 
the chain rule, f′(x) = 1/(1 − x) for 0 ≤ x < 1. We have two functions with the 
same derivative, so they must differ by a constant (a consequence of the mean 
value theorem). But L(0) = 0 = f(0), so we conclude L(x) = f(x) = −ln(1 − x) 
for 0 ≤ x < 1. 

This gives us a method of computing natural logarithms: For 0 < y < 1, set 
x = 1 − y (so y = 1 − x) and then ln(y) = −L(x) = −L(1 − y). We know 
ln(1) = 0. For y > 1, set x = (y − 1)/y (so 1 − x = 1/y) and then ln(y) = 
−ln(1/y) = L(x) = L((y − 1)/y). 
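
The recipe for logarithms can be sketched the same way. The names `L_partial_sum` and `ln_series` and the default of 200 terms are assumptions for the illustration; the comparison with `math.log` is only a check. Note that the series converges slowly when x is close to 1, that is, when y is very small or very large, so a fixed number of terms is not adequate there.

```python
import math


def L_partial_sum(x, terms=200):
    """Partial sum of L(x) = x + x^2/2 + x^3/3 + ... for 0 <= x < 1."""
    if not 0 <= x < 1:
        raise ValueError("L(x) is only used for 0 <= x < 1")
    total = 0.0
    power = 1.0
    for n in range(1, terms + 1):
        power *= x  # power == x**n
        total += power / n
    return total


def ln_series(y, terms=200):
    """ln(y) for y > 0, following the three cases of the remark."""
    if y <= 0:
        raise ValueError("ln(y) is defined only for y > 0")
    if y < 1:
        return -L_partial_sum(1 - y, terms)
    if y == 1:
        return 0.0
    return L_partial_sum((y - 1) / y, terms)


error_at_two = abs(ln_series(2.0) - math.log(2.0))
error_at_half = abs(ln_series(0.5) - math.log(0.5))
print(ln_series(2.0), error_at_two, error_at_half)
```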
The Logical Structure of Computer-Aided 
Mathematical Reasoning 


Keith Devlin 


1. NEW MATH. Over the past decade or so, the professional mathematician has 
changed from being a person who sits at a desk working with a paper and pencil to 
a person who spends a lot of time sitting in front of a computer terminal. The 
paper and pencil are still there, but a lot of the mathematician’s activities now 
involve use of the computer. In particular, powerful computer packages like 
Mathematica and Maple advertise themselves as systems for “doing math on a 
computer.” This rapid transformation of mode of working has changed the nature 
of doing mathematics in a fundamental way. Mathematics done with the aid of a 
computer is qualitatively different from mathematics done with paper and pencil 
alone. The computer does not simply ‘assist’ the mathematician in doing business 
as usual; rather, it changes the nature of what is done. In particular, the logical 
structure of mathematical reasoning carried out with the aid of an interactive 
computer system is different from the structure of the more traditional form of 
mathematical reasoning. This paper, in part survey, in part a presentation of some 
new ideas (for none of which the author claims priority or unique observation), 
compares the old and the new from a logician’s standpoint. 

Strictly speaking, many of the points raised about computer-aided mathematics 
are not restricted to mathematics done on a computer. Most of the points apply, to 
some extent, to mathematics done in the fashion familiar to the ancient Greeks. 
However, the introduction of computer-aided techniques in mathematics has made 
those points far more salient, indeed unavoidable. 


2. PROOFS AND REFUTATIONS. Ever since the ancient Greek mathematician 
Thales introduced the notions of theorem and proof in the sixth century B.C., 
proofs have played an important role in mathematics. It is by means of logically 
rigorous ‘proofs’ that the truths of mathematics are ultimately determined. One of 
the crowning glories of late nineteenth and early twentieth century logic was the 
formulation of a formal definition of the notion of a mathematical proof. 

According to mathematical logic, a proof of a mathematical assertion Φ 
consists of a finite sequence φ₁, φ₂, ..., φ_n such that φ_n = Φ, and for each 
i = 1, ..., n − 1, φ_{i+1} is either an axiom or else follows from φ₁, ..., φ_i by an 
allowable rule of logical inference (such as modus ponens). In order to make this 
completely precise, the statements Φ, φ₁, ..., φ_n should be written out in a 
formally specified language, such as predicate logic. 

However, a glance at any mathematical textbook, research monograph, or 
research paper chosen at random—even one in mathematical logic—will indicate 
that mathematicians almost never write proofs in the strict form specified by the 
logician’s definition. In fact, the only arguments that are ever written out in strict 
logical fashion are proofs of extremely simple assertions given as illustrations of 
formal proofs, not proofs of statements of any genuine mathematical interest. 
There is good reason for this: for all but the simplest of mathematical assertions, a 
proof written out according to the rules of formal logic would be practically 
impossible to follow or to understand. 

The logician’s notion of a formal proof is not a ‘template’ for the formulation of 
actual proofs; rather it is an ‘idealization’, or an ‘abstract model’, of a proof. A 
correct proof can in principle be formulated according to the strict rules of logic 
(and an invalid ‘proof’ cannot), and in this asymptotic sense the logician’s notion of 
a formal proof relates to actual proofs in mathematics. But is there a tighter 
connection than this? How closely do the logician’s proofs correspond to actual 
mathematical practice? Does the formal definition of a proof tell us very much 
about everyday mathematical arguments? 

This question cannot be answered without first making precise just what is 
meant by the phrase ‘everyday mathematical arguments’. 

Ordinary mathematical practice makes at least three different, though overlap- 
ping, uses of methods of reasoning referred to as ‘logical arguments’. 

First of all, there are what might be called proofs for the record. These are the 
proofs that mathematicians give in their published research papers. 

Second, there are the pedagogic arguments, the arguments mathematicians use in 
order to explain and to convince their colleagues or their students of the truth of a 
particular assertion. 

Third, there are the action arguments. These are the arguments mathematicians 
use in the course of solving a problem. 

What are the differences between these three notions of a mathematical 
argument, and how does each relate to the logician’s notion of a proof? 

The logician’s formal notion of a proof really corresponds only to proofs for the 
record. Arguments of the second and third kinds are generally very different. 
Moreover, the distinction between proofs for the record and the other kinds of 
argument is growing more significant as a result of the growing use of computers 
(and other technological tools) in mathematics. 

In fact, the pedagogic and action arguments are the ones that most typify, and 
play the most significant role in, actual mathematical practice. Consequently, and 
contrary to the view implicit in classical logic, mathematics is, I suggest, far more 
like (Popper’s description of) natural science than it resembles formal logic. 


In his enormously influential book Conjectures and Refutations: The Growth of 
Scientific Knowledge, published in 1963 [8], the philosopher Karl Popper put 
forward the thesis that scientific theories are not, and cannot be, ‘proved’. Rather 
a particular theory is at best accepted for the time, based on its agreement with, 
and prediction of, relevant observed facts, and remains accepted until refuted by 
some means, either a convincing counterargument or the observation of evidence 
to the contrary. Though nowadays widely accepted, Popper’s suggestion runs 
counter to the generally accepted view of scientific knowledge that prevailed over 
the preceding centuries, whereby the primary task of the scientist was to obtain 
evidence to confirm a particular theory—the more confirming evidence that was 
obtained, the more secure was the theory taken to be. According to Popper, the 
scientist tests a theory not by confirmation but by seeking refutation. The longer a 
theory withstands attempts to find a refutation, the more one might feel confident 
in the theory and follow its predictions. 

It seems clear to me that, if you examine what mathematicians actually do in 
their daily business—and ignore what it is they often say they are doing—then 
what actually constitutes a proof is far closer to Popper’s idea than to the logician’s 
concept of a formal proof. In the everyday world of mathematics, a proof is, I 
suggest, nothing more nor less than an argument that: 


(i) has been declared correct by a number of mathematicians acknowledged by 
the mathematical community to be capable of making such judgments; 
(ii) has not (yet) been refuted. 


Of course, the comparison between mathematics and natural science can be 
taken only so far. Part of Popper’s thesis is that a scientific theory can only be an 
‘approximation’ to reality; whether or not there is such a thing as ‘truth’ in science, 
our knowledge of the world can never be absolute. In contrast, mathematical truth 
is determinate and eternal, and our knowledge of a mathematical fact is, at least in 
principle, absolute. It is in its daily practice that mathematics resembles (I claim) 
Popperian science. What makes the comparison of interest is that the daily 
practice of mathematics is what mathematicians actually spend most of their time 
doing! 

Incidentally, the title of this section is based on the title of Popper’s book. 
Popper’s work also inspired philosopher Imre Lakatos to name his 1976 book 
Proofs and Refutations. The parallel with the work I present in this paper is far 
closer to Popper’s arguments about science than to Lakatos’s discussion of mathe- 
matical proofs. 


3. REAL DEDUCTION. Mathematical logic deals almost exclusively with a highly 
idealized kind of ‘reasoning’ that is not the reasoning that mathematicians actually 
employ. This is not at all a criticism of mathematical logic. On the contrary, 
mathematical logic succeeds brilliantly in its stated aim—to analyze the notion of 
mathematical proof as an idealized concept. Logic does not set out to investigate 
the daily praxis of mathematics. But what then of that daily praxis, which classical 
logic ignores? That daily praxis is the topic of this paper. What is the logical 
structure of the actual mathematical reasoning processes that mathematicians use 
in their daily work? 

Specific objections to classical logic as a model of actual mathematical reasoning 
are that it takes no account of the following two features of actual mathematical 
reasoning: 


• The use of diagrams and visual reasoning processes. 
• Interactivity and dynamic representations. 


Diagrams and visual reasoning have always played a major role in 
both pedagogic arguments and action arguments. And, as a result of the develop- 
ment of computers and other technological aids to reasoning, we are surely going 
to see a continuation of the steady rise in the use of interactivity and dynamic 
representation both in actual reasoning and in the subsequent presentation of 
arguments and results. 

Of course, since the time of the ancient Greeks, mathematicians have made 
extensive use of diagrams, even in proofs for the record. But such uses have always 
been essentially as peripherals, to simplify the notation, to help make adequate 
reference to various mathematical entities, or to generally aid the reader’s compre- 
hension. It was regarded almost universally as forbidden for a proof for the record 
to make essential use of a diagram. 

An obvious exception to the preceding statement, it might be suggested, were 
the ruler and compass constructions of the ancient Greeks, where the diagram was 
the very focus of the argument. However, even in that case one could argue that 
the main issue was what figures could be constructed in principle, and that the 
actual ‘proof’ did not depend on any particular diagram. Rather, the reader was 
invited to convince him or herself, by abstract, internal reflection, that a certain 
sequence of steps would always yield the requisite figure. Indeed, it is a crucial 
aspect of writing up a ruler and compass ‘proof’, or indeed any proof in geometry, 
that any diagram drawn is ‘generic’, having no special features other than those 
specified in the problem. For instance, results about general triangles should not 
be obtained using diagrams of right-angled or equilateral triangles, a result about 
ellipses should not be illustrated with a diagram of a circle, and so forth. 

The reason for avoiding diagrams with significant features not specified in the 
problem is both obvious and sensible. A valid proof cannot make use of assump- 
tions other than those that are given (either explicitly or implicitly) in the problem. 
For example, a proof of a general result about triangles that depends upon the 
isosceles nature of the particular triangle drawn in the diagram will not be 
valid—at best it will be a proof that applies only to isosceles triangles. And so 
forth. 

The danger in using diagrams—any diagram—is that there is always more 
information present in the diagram than stipulated in the problem. Even if, in 
attempting to prove a result about arbitrary triangles, a mathematician tries to 
draw a ‘generic’ triangle, it is impossible to avoid the diagram incorporating 
unwarranted assumptions. For example, which of the two triangles shown in Figure 
1 is the more ‘generic’? Each has properties not shared by the other, and is thus 
not typical of all triangles. 


Figure 1. Which is the ‘typical’ triangle? 


The point is, diagrams provide extremely rich and powerful representations of 
information. This power certainly does have the potential to mislead. But it is also 
the reason why mathematicians find the use of diagrams almost indispensable in 
their daily work. The use of the human mind to process written language, in 
particular written mathematical proofs, is a very recent evolutionary development, 
whereas humans have been using their brains to process visual information since 
our species first evolved. The very survival of our species depended on the human 
visual system, and by far the greatest proportion of the human brain is devoted to 
handling the input from the eyes. We are then far better equipped to handle 
scenes and illustrations than we are to deal with written language. 

As a simple but dramatic illustration of the power of visual representations, 
consider Figure 2, which shows the layout of a number of blocks on a table. The 
task you are faced with is to determine the least number of blocks that must be 
removed in order to remove block L. It takes just a moment to solve the problem, 
based on a quick glance at the diagram. Now consider the same problem, but with 
the necessary information about the arrangement of the blocks given in predicate 
logic form: 


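\[
\begin{aligned}
&\mathrm{On}(A,B) \land \mathrm{On}(B,C) \land \mathrm{On}(C,G) \land \mathrm{On}(G,M) \land \mathrm{On}(M,E) \land \mathrm{On}(M,I) \land {} \\
&\mathrm{On}(G,N) \land \mathrm{On}(N,I) \land \mathrm{On}(N,Q) \land \mathrm{On}(C,P) \land \mathrm{On}(P,S) \land \mathrm{On}(S,Q) \land {} \\
&\mathrm{On}(S,K) \land \mathrm{On}(A,J) \land \mathrm{On}(J,F) \land \mathrm{On}(L,R) \land \mathrm{On}(R,K) \land \mathrm{On}(F,L) \land {} \\
&\mathrm{On}(R,O) \land \mathrm{On}(L,H) \land \mathrm{On}(A,O) \land \mathrm{On}(H,D) \land \mathrm{On}(B,F) \land \mathrm{On}(F,P) \land {} \\
&\forall x\,\forall y\,\forall z\,[\mathrm{On}(x,y) \land \mathrm{On}(y,z) \rightarrow \mathrm{On}(x,z)]
\end{aligned}
\]

Figure 2. What is the least number of blocks that must be removed to uncover block L?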


Which blocks must be removed in order to remove block L? Now the problem is 
much more difficult. And yet, in a very real sense, stating the problem in this 
formulaic fashion makes it informationally ‘simpler’, by providing just the pieces of 
information directly relevant to the problem, the information about which blocks 
rest on top of which other(s), together with the crucial transitivity condition on
which the solution depends. Additional information given in the diagram, informa- 
tion not relevant to the solution, such as the lateral arrangement of the blocks 
(which ones are to the left/right of which others) is omitted. In fact, the diagram is
cluttered with additional information irrelevant to the solution of the problem, 
whereas the formulaic representation provides just the information on which the 
solution depends. And yet, most people would say without hesitation that the 
problem is far simpler when presented as a diagram than when stated linguistically, 
using formulas. This simplicity is not informational simplicity, in the sense of the 
absence of spurious additional information; rather the diagrammatic representation
makes the problem simpler to the human mind because such a representation is 
ideally suited to visual processing, a task that the human visual-cognitive system 
performs with considerable ease. 

It is interesting to note that the situation is completely reversed for a computer. 
It is easy to program a computer to solve the problem when it is input in the 
formulaic form, but writing a program that allows a computer to process visual 
input (say, from a video camera) is a significant challenge that computer science 
has by no means fully solved. 
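To give a rough sense of how little the formulaic version asks of a machine, here is a minimal Python sketch. It assumes the On facts are exactly those listed in the formula above; the names ON_FACTS and blocks_above are introduced only for this illustration. Transitivity is not stated as a separate rule but is applied implicitly by walking upward from the target block.

```python
from collections import defaultdict

# On(x, y) facts: block x rests directly on block y.
# Assumption: this list is a faithful transcription of the formula in the text.
ON_FACTS = [
    ("A", "B"), ("B", "C"), ("C", "G"), ("G", "M"), ("M", "E"), ("M", "I"),
    ("G", "N"), ("N", "I"), ("N", "Q"), ("C", "P"), ("P", "S"), ("S", "Q"),
    ("S", "K"), ("A", "J"), ("J", "F"), ("L", "R"), ("R", "K"), ("F", "L"),
    ("R", "O"), ("L", "H"), ("A", "O"), ("H", "D"), ("B", "F"), ("F", "P"),
]


def blocks_above(target, facts):
    """Return every block that is on `target`, directly or by transitivity."""
    directly_on = defaultdict(set)  # lower block -> blocks resting directly on it
    for upper, lower in facts:
        directly_on[lower].add(upper)

    above, frontier = set(), [target]
    while frontier:
        block = frontier.pop()
        # Visit only blocks we have not already recorded.
        for upper in directly_on[block] - above:
            above.add(upper)
            frontier.append(upper)
    return above


if __name__ == "__main__":
    must_go = blocks_above("L", ON_FACTS)
    print(sorted(must_go), len(must_go))
```

Under the stated assumption, the blocks reported are those that must be lifted off before block L can be removed. The program never needs the lateral arrangement of the blocks, which is exactly the information the formula omits.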

The use of diagrams may not only make the solution to a problem easier, it may 
also make it better. One way one proof can be ‘better’ than another is by providing 
greater insight into the problem. After all, the ultimate aim of a mathematical 
analysis is not ‘truth’ but understanding; truth is just one (very important) part of 
understanding. For example, consider the well known result that the sum of the 
first n odd integers is equal to $n^2$. This can be expressed by means of the identity

\[ 1 + 3 + 5 + \cdots + (2n - 1) = n^2. \]
A straightforward induction proof verifies this identity for all n. But where is the 
insight? In what way does the proof increase our understanding of what is going 
on? 
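For reference, the induction argument in question amounts to a one-line calculation. Here S(n) denotes the sum of the first n odd integers; the name S is introduced only for this sketch, and the second line assumes the inductive hypothesis S(n) = n^2.

```latex
\begin{align*}
S(1)   &= 1 = 1^2, \\
S(n+1) &= S(n) + \bigl(2(n+1) - 1\bigr) = n^2 + 2n + 1 = (n+1)^2 .
\end{align*}
```

The calculation confirms the identity, but nothing in it suggests why squares should appear.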

For one thing, the very statement of the result as an algebraic identity is highly 
unnatural, failing to give explicit representation to all the features that make the 
result interesting: that what are being summed are the first n odd integers, and
that the result is the square of the number of odd integers being summed. The 
algebraic identity looks like just another curious, ‘accidental’ result of algebra; the 
expression in words might be a little more cumbersome, but at least it makes it 
clear that the result has definite interest, and expresses a ‘fundamental’ property 
of the integers. 

Turning to the proof, the standard induction proof provides little by way of 
insight, and certainly no clue as to how the identity might have been discovered in 
the first place. It simply confirms the truth of the identity. Now look at Figure 3. 
This diagram proves the result without any need for additional explanation. The 
eye at once sees the crucial pattern that yields the result. The additional complex- 
ity that arises from the geometric aspects of the diagram does not lead to a more 
difficult proof; on the contrary, the proof becomes transparent. And far more 
informational. 


Figure 3. A proof without words or formulas. 
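The pattern the eye picks out can also be described computationally. The following minimal Python sketch assumes that Figure 3 shows the familiar arrangement of an n-by-n square of dots divided into nested L-shaped layers; the function names layer_grid and layer_sizes are hypothetical and serve only this illustration.

```python
def layer_grid(n):
    """Label each cell of an n-by-n square with the L-shaped layer it lies in.

    Layer k (k = 1, ..., n) consists of the cells whose larger coordinate is k.
    """
    return [[max(row, col) for col in range(1, n + 1)] for row in range(1, n + 1)]


def layer_sizes(n):
    """Count the cells in each layer, from the innermost layer outward."""
    grid = layer_grid(n)
    return [sum(row.count(k) for row in grid) for k in range(1, n + 1)]


if __name__ == "__main__":
    for row in layer_grid(5):
        print(" ".join(str(k) for k in row))
    print(layer_sizes(5))
```

Each layer wraps around the square formed by the earlier ones, so layer k contains 2k - 1 cells, and the n layers together fill the whole square of n^2 cells.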


From the perspective of the broad goal of understanding, rather than the far 
narrower one of establishing truth, there is a clear advantage of a transparent, 
explanatory proof such as the preceding one, over a technically complex but less 
illuminating argument. Of course, in many cases, there is little choice. Given that 
establishing mathematical truth is a significant part of mathematics, any proof is 
better than none. 

However, pedagogic advice is not the main focus of this article. Rather, I am
concerned here not so much with what we mathematicians should do but what we 
actually do. And what we do when we set about solving a problem is primarily a 
process of acquiring information. The establishment of truth may be the principal 
goal, but it is achieved by acquiring information about the problem. The role of 
truth in the construction of proofs is to motivate and guide the accumulation of the 
appropriate information. By concentrating on truth, indeed, by being defined in a 
truth-theoretic fashion, classical logic does not capture the process of (actual) 
proofs. Rather, what classical logic does, and does extremely well, is capture the 
outcomes of the proof process. 
This distinction between truth and information becomes the more marked when
mathematical analyses and arguments are carried out with the aid of a computer. 
Computers do not establish truth; they do not deal with truth. What computers do 
is store and process information. More precisely, computers store and manipulate 
representations of information—this distinction plays an important role in the 
analysis of reasoning I want to discuss. 


4. INFORMATION AND REPRESENTATION. In order to analyze the structure 
of mathematical arguments viewed as processes of accumulating information, we 
should first ensure we have a sufficiently clear understanding of what is meant by 
the word ‘information’. And here at once we have a problem. For, whereas we all 
probably have a reasonable intuitive conception of information, there is no 
mathematically precise definition, or at least no generally accepted definition. 
What there is—and what we will have to content ourselves with, at least 
initially—are a number of refined intuitions that can be captured in a moderately 
rigorous framework. 

The first intuition is that information (whatever it is) is not the same as the 
thing in the world that represents it. What information a person, animal, com- 
puter, or other kind of agent can acquire from some object or part of the world 
depends on a number of factors. 

For example, suppose you come across a tree stump in the forest. What 
information can you pick up from your find? Well, if you are aware of the 
relationship between the number of rings in a tree trunk and the age of the tree, 
the stump can provide you with the age of the tree when it was felled. To someone 
able to recognize various kinds of bark, the stump can provide the information as 
to what type of tree it was, its probable height, shape, leaf pattern, and so on. To 
someone else it could yield information about the weather the night before, or the 
kinds of insects or animals that live in the vicinity; and so on. It is not hard to think 
of further pieces of information that can be obtained from the stump. 

Likewise, your acquisition of information stored in books depends on your 
knowing the language in which the book is written. You probably need more. To 
profit from reading an advanced book on manifolds, for example, you need to start 
out with a considerable knowledge of the field. 

In general, information can be obtained from objects, scenes, events, environ- 
ments, various kinds of physical configuration (including books and other written 
materials), mathematical structures, and so forth. The single term situation may be 
used to refer to any of these information bearing entities, a word whose normal 
meaning accords very well with some of the entities that provide information, less 
well with others. 

In order to study mathematical reasoning that involves different forms of 
representation —formulas, words, diagrams, graphs, computers, and whatever—we 
need a formal machinery to handle the manner in which: 


(i) a situation can represent information;
(ii) a situation can represent different information; 
(iii) different situations can represent the same information; 
(iv) information can be represented in a distributed fashion, involving differ- 
ent representations in different situations; 
(v) information can be processed by manipulating representations. 


These are general issues that arise in any kind of information scenario. The only 
special feature in the case of mathematical reasoning is that the situations of 
primary concern are mathematical structures and the various representational
media involved. 

Notice that any framework that can handle these five issues must surely include 
a notion of information that is independent of representation. Thus, what we need 
is a formal apparatus to handle situations and suitable notions of ‘information’ and 
‘representation’. An appropriate theory was developed during the 1980s, starting 
with some ideas of Jon Barwise and John Perry, who called the new framework 
situation theory. The brief summary of this theory given in the following section is 
designed to provide a general sense of the theory at an intuitive level. The reader 
who wants to gain a usable understanding of the theory should consult [6] for a far 
more extensive coverage. 


5. SITUATION THEORY. Situation theory is a mathematical theory designed to 
provide a framework for the study of information. The theory takes its name from 
the mathematical device introduced in order to take account of context and 
partiality. A situation can be thought of as a limited part of reality. Such parts may 
have spatio-temporal extent, or they may be more ‘abstract,’ such as fictional 
worlds, contexts of utterance, problem domains, mathematical structures, 
databases, or Unix directories. 

In situation theory, information is always taken to be information about some 
situation, and is taken to be in the form of discrete items known as infons. These 
are of the form 


\[ \langle\langle R, a_1, \ldots, a_n, 1 \rangle\rangle, \qquad \langle\langle R, a_1, \ldots, a_n, 0 \rangle\rangle \]


where R is an n-place relation and $a_1, \ldots, a_n$ are objects appropriate for R (often
including spatial and/or temporal locations). These may be thought of as the
informational item that objects $a_1, \ldots, a_n$ do, respectively, do not, stand in the
relation R. The final constituent in an infon, 0 or 1, is called its polarity.
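For readers who find a data structure helpful, an infon can be pictured as a small immutable record. The following is a minimal Python sketch, not part of situation theory itself: the class name Infon, its field names, and the helper flipped are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Infon:
    """An item of information <<R, a_1, ..., a_n, polarity>>."""

    relation: str
    arguments: Tuple[object, ...]
    polarity: int  # 1: the objects stand in the relation; 0: they do not

    def __post_init__(self):
        if self.polarity not in (0, 1):
            raise ValueError("polarity must be 0 or 1")

    def flipped(self):
        """Return the infon with the same relation and objects but opposite polarity."""
        return Infon(self.relation, self.arguments, 1 - self.polarity)


if __name__ == "__main__":
    on_ab = Infon("On", ("A", "B"), 1)
    print(on_ab)
    print(on_ab.flipped())
```

Note that the record carries no truth value of its own; it only becomes true or false relative to a situation.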

Infons are ‘items of information’. They are not things that in themselves are true 
or false. Rather a particular item of information may be true or false about a 
situation. Though I have here introduced them as composite objects, built from 
relations and other objects, infons may also be defined as equivalence classes of 
representations under the equivalence relation of ‘informational equivalence’, 
where a representation is defined as an ordered pair consisting of a situation and a 
constraint. This idea is explained more fully in [6]. 

Infons may be combined in various ways to produce more complex informa- 
tional items known as compound infons, described momentarily. 

Given a situation, s, and an infon (or compound infon) $\sigma$, we write


\[ s \models \sigma \]


to indicate that the infon $\sigma$ is ‘made factual by’ the situation s, or, to put it
another way, that $\sigma$ is an item of information that is true of s. The official name
for this relation is that s supports $\sigma$. The facticity claim $s \models \sigma$ is referred to as a
proposition.

If s is a situation and $\Sigma$ is a finite set of infons, we write


\[ s \models \Sigma \]


to mean that $s \models \sigma$ for all infons $\sigma \in \Sigma$.

There are several ways that infons may be combined to form compound infons. 
We need two particularly simple ways for this account. 

First of all, we can form the conjunction, $\sigma \land \tau$, of two infons (or of two
compound infons), $\sigma$, $\tau$. The informational meaning of the conjunction operation
is fairly clear: the compound infon $\sigma \land \tau$ is the informational ‘item’ that comprises
both $\sigma$ and $\tau$.
By definition, for any situation, s, we have


\[ s \models \sigma \land \tau \quad \text{if and only if} \quad s \models \sigma \text{ and } s \models \tau. \]


Likewise, the disjunction of two infons (or compound infons), $\sigma$, $\tau$, is a
compound infon, $\sigma \lor \tau$, such that for any situation, s,


\[ s \models \sigma \lor \tau \quad \text{if and only if} \quad s \models \sigma \text{ or } s \models \tau \text{ (or both)}. \]
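The two clauses just given translate directly into a recursive check. The sketch below is a toy model under a strong assumption: a situation is represented simply as the finite set of basic infons it supports. The names Infon, And, Or, supports and supports_all are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple, Union


@dataclass(frozen=True)
class Infon:
    relation: str
    arguments: Tuple[str, ...]
    polarity: int


@dataclass(frozen=True)
class And:
    left: "Item"
    right: "Item"


@dataclass(frozen=True)
class Or:
    left: "Item"
    right: "Item"


Item = Union[Infon, And, Or]


def supports(situation: FrozenSet[Infon], item: Item) -> bool:
    """Decide whether the situation supports a basic or compound infon."""
    if isinstance(item, And):
        return supports(situation, item.left) and supports(situation, item.right)
    if isinstance(item, Or):
        return supports(situation, item.left) or supports(situation, item.right)
    # Toy assumption: a basic infon is supported exactly when it is a member.
    return item in situation


def supports_all(situation: FrozenSet[Infon], items: Iterable[Item]) -> bool:
    """Decide whether the situation supports every item in a finite set."""
    return all(supports(situation, item) for item in items)


if __name__ == "__main__":
    a_on_b = Infon("On", ("A", "B"), 1)
    b_on_a = Infon("On", ("B", "A"), 1)
    s = frozenset({a_on_b})
    print(supports(s, Or(a_on_b, b_on_a)), supports(s, And(a_on_b, b_on_a)))
```

Real situations are of course far richer than finite sets of infons; the sketch only mirrors the logical shape of the definitions.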


The term constraint refers to the abstract mechanism by which a situation can 
encode, represent, and yield information and a person or agent can extract that 
information. Constraints may be natural laws, conventions, analytic rules, linguistic 
rules, empirical, law-like correspondences, or whatever. 

Constraints operate at what situation theorists call the ‘type level’. I can 
illustrate types and constraints by means of a simple example. 

Suppose you are driving a car in an unfamiliar city and you come to an 
intersection controlled by traffic lights. You have never before been at that 
particular intersection, and have never before seen those particular traffic lights. 
Nevertheless, you know how to behave: if the lights are red, you stop, if they are 
green, you proceed. Your behavior cannot be based purely on the circumstances of 
the particular situation you find yourself in. That is a brand new situation you have 
never encountered before. Rather the situation is of a type you are very familiar 
with. You know how to behave in any intersection-controlled-by-traffic-lights 
situation. Your behavior is determined by the situation type. You are familiar with 
a constraint that links one type of situation, the type where the light is at red, with 
another type of situation, the type where you stop the car. In this case, the 
constraint is codified into the traffic laws, but many constraints are not. 

For an example of a constraint that is not part of our system of laws, suppose 
you look up and see dark clouds in the sky. You say to yourself, “It looks like it 
might rain today.” On the basis of one situation, the cloudy sky right now, you infer 
information about another, future situation, the weather in that region later in the 
day. The basis for this inference is that you are aware of a systematic relation (a 
constraint) between skies of a certain type and subsequent weather of a certain 
type. When the sky is of the type ‘dark clouds’, it is often the case that the weather
is subsequently of the type ‘raining’. The actual dark cloud formation that you see 
on that particular occasion does not, in itself, tell you anything. It is a one-off 
event. It is by virtue of the sky being of a recognizable type that you can obtain the 
information you do. If we were not able to recognize types, the world would always 
be presented to us anew, and we would be unable to make any reliable inferences 
based on prior knowledge or past experiences. 

The ability to recognize types of things lies at the basis of much of human 
cognition and communication. Humans are type recognizers. So too, it appears, are 
various animals—bees seem able to recognize the types of certain flowers, and pet 
cats and dogs seem able to recognize the type of feeding bowls and the type of 
doorways. 

Many of the words in our language refer to types: types of object, types of 
action, etc. For example, nouns that denote things do so by making reference to 
types: ‘house’ refers to any house, not one particular house; ‘car’ refers to any car, 
not one particular car; ‘mountain’ refers to any mountain, regardless of its exact 
shape or size or location; and so on. Similarly for verbs: ‘walk’ refers to any 
walking action by any person or legged animal; ‘run’ refers to any running action; 
‘climb’ refers to any climbing action; etc. Such nouns and verbs can be used on any
particular occasion to refer to a particular thing or action, but such reference is 
only possible because their ‘meaning’ refers to types of things or actions. 

Situation theorists reify such common features of situations as situation types. 
The ‘type’ of all traffic-light-at-red situations is what all those situations have in 
common—whatever that is. This sounds a bit mysterious, and it is. But it is no 
more mysterious than the counting numbers that we learn to use when we are two 
or three years old, 1, 2, 3, etc. When you ask what the number 5 really ‘is’, the only 
answer is that it is what all collections of five objects have in common, five apples, 
five oranges, five elephants, five languages, etc. 

Likewise, situation theorists define constraints in a formal fashion as relations 
between pairs of situation types. 

For a mathematical example, groups constitute a type. The integers under 
addition and the nonzero rationals under multiplication are two situations of the 
type ‘group’. Whenever you learn that a particular mathematical structure (i.e., a
particular situation) is of the type ‘group’, you immediately know a lot about the 
properties of that situation—even if it is a situation (i.e., a structure) you have 
never encountered before. (Just as for the traffic lights example.) 

The eventually successful search for a classification of the finite simple groups 
can be regarded as an attempt to establish a constraint to the effect that any 
structure (situation) of the type ‘finite simple group’ must be of a type among a 
certain collection of other types (type of ‘cyclic group’, type of ‘alternating group’, 
etc.) 

The machinery of situations, types, and constraints allows us to analyze informa- 
tion. Given an appropriate constraint, anything can be used to store information. A 
knowledge or an awareness of the relevant constraint, or an adaptation to it, is 
what enables a person to acquire the information represented by way of that 
constraint. Constraints also facilitate reasoning and communication, including, of 
course, reasoning and communication in mathematics. Inference and deduction are 
activities whereby certain facts (items of information) about some situation are 
used to extract additional information that is in some sense implicit in those initial 
facts. For example, familiarity with the constraint that smoke comes from fire 
enables a person or an animal to infer the presence of fire from the appearance of 
smoke. 

From the perspective of situation theory, ‘logic’ is the study of certain kinds of 
constraints, with classical, first-order, predicate logic being the study of those 
particular constraints established by the axioms of first-order logic. 

Notice that the situation-theoretic conception of deduction makes no mention 
of language. This is quite different from the case of classical logic, in which 
language (i.e., first-order predicate language) plays a pre-eminent role. Language 
remains an important means of performing inference, but it is by no means 
all-embracing. People make non-linguistic inferences every day. As you prepare to 
leave for work in the morning you notice the dark clouds in the sky and reach for 
your raincoat, having (correctly) inferred that (a) there is a strong possibility of 
rain, and (b) if you do not take your raincoat you may get wet. No use of language 
is involved in making these inferences (except subsequently, perhaps, should you 
have cause to reflect upon why you acted as you did). Indeed, similar kinds of 
inference are made routinely all the time by animals and organisms that do not 
have any linguistic capacities. 

Situation theory provides a rich mechanism to generate types and construct 
typed parameters. For the present purposes, we need just one type-formation
operation. Given a situation parameter $\dot{s}$ (dots generally denote parameters) and a
compound infon $\sigma$, there is a corresponding situation-type

\[ [\dot{s} \mid \dot{s} \models \sigma] \]

the type of situation in which $\sigma$ obtains. For example,

\[ [\dot{s} \mid \dot{s} \models \langle\langle \text{run}, \text{Max}, 1 \rangle\rangle] \]

denotes the type of situation in which Max is running.

As we have observed, constraints are binary relations that link two types. The 
class of constraints that we need for our present analysis of mathematical reason- 
ing are all of the form 

\[ S \Rightarrow S' \]
where S and S’ are situation types. This constraint says that whenever there is a 
situation s of type S, there is a related situation s’ of type S’. For many 
constraints, s’ is either the same as s or else an extension of s. This is particularly 
true of many mathematical theorems, which are of the form: if s is a structure (i.e., 
a situation) of type S, then s is a structure of type S’. 
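One way to picture types and constraints concretely is to treat a situation type as a predicate on situations. The minimal Python sketch below again assumes the toy model in which a situation is a finite set of basic infons, and it covers only the special case just mentioned, where the related situation s' is s itself. The names type_supporting and respects, and the relation "move", are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Infon = Tuple[str, Tuple[str, ...], int]  # (relation, arguments, polarity)
Situation = FrozenSet[Infon]
SituationType = Callable[[Situation], bool]


def type_supporting(infon: Infon) -> SituationType:
    """Return the type [s | s supports infon] as a predicate on situations."""
    return lambda situation: infon in situation


def respects(
    antecedent: SituationType,
    consequent: SituationType,
    situations: Iterable[Situation],
) -> bool:
    """Check S => S' in the case s' = s: every situation of type S is of type S'."""
    return all(consequent(s) for s in situations if antecedent(s))


if __name__ == "__main__":
    max_runs: Infon = ("run", ("Max",), 1)
    max_moves: Infon = ("move", ("Max",), 1)
    running = type_supporting(max_runs)
    moving = type_supporting(max_moves)

    s1 = frozenset({max_runs, max_moves})
    s2 = frozenset({max_moves})
    print(respects(running, moving, [s1, s2]))
```

A check over finitely many sample situations can only refute a proposed constraint, never establish it; establishing the constraint in general is what a theorem does.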


6. THE BLOCKS PROBLEM. As an illustration of how situation theory may be 
used to analyze the solution to a mathematical problem, consider the stacked 
blocks problem introduced in Section 3. Let s be the situation represented by the 
diagram in Figure 2. It does not matter in this case whether s is an actual stack of 
blocks or an imaginary one; in fact, one can even let s be the diagram itself. The 
point is, s is extremely rich in information, most of which is logically—but perhaps 
not cognitively—irrelevant to the problem of determining how many blocks must 
be removed in order to uncover block L. 

The solution to the problem depends (logically) upon the following two proposi- 
tions: 


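1. $s \models \langle\langle \mathrm{On}, A, B, 1 \rangle\rangle \land \langle\langle \mathrm{On}, A, J, 1 \rangle\rangle \land \langle\langle \mathrm{On}, B, F, 1 \rangle\rangle \land \langle\langle \mathrm{On}, J, F, 1 \rangle\rangle \land \langle\langle \mathrm{On}, F, L, 1 \rangle\rangle$
2. $\mathcal{M} \models [\dot{s} \mid \dot{s} \models \langle\langle \mathrm{On}, x, y, 1 \rangle\rangle \land \langle\langle \mathrm{On}, y, z, 1 \rangle\rangle] \Rightarrow [\dot{s} \mid \dot{s} \models \langle\langle \mathrm{On}, x, z, 1 \rangle\rangle]$


The first proposition provides the minimal stacking information required to
solve the problem. In the second, $\mathcal{M}$ denotes the framework of elementary
mathematical reasoning. Notice that the situation $\mathcal{M}$ has not been precisely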
defined—at least not by listing all of the rules mathematicians use in solving 
problems. In terms of describing the logical solution to the blocks problem, there is 
no need to do so. Proposition 2 supplies the crucial rule of logic required to solve 
this particular problem. 
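To see that the five facts of proposition 1 together with the constraint of proposition 2 really do suffice, one can apply the constraint mechanically until nothing new is produced. The following is a minimal Python sketch under that reading; the names PROPOSITION_1 and close_under_transitivity are introduced only for this illustration.

```python
# The five On facts named in proposition 1, written as (upper, lower) pairs.
PROPOSITION_1 = {("A", "B"), ("A", "J"), ("B", "F"), ("J", "F"), ("F", "L")}


def close_under_transitivity(on_facts):
    """Apply On(x, y) and On(y, z) => On(x, z) repeatedly until nothing new appears."""
    closed = set(on_facts)
    while True:
        derived = {
            (x, z)
            for (x, y1) in closed
            for (y2, z) in closed
            if y1 == y2
        }
        if derived <= closed:
            return closed
        closed |= derived


if __name__ == "__main__":
    closed = close_under_transitivity(PROPOSITION_1)
    print(sorted(x for (x, y) in closed if y == "L"))
```

The derived facts place A, B, J and F above L, and no other block is mentioned, which is why the remaining contents of the stack are logically irrelevant to the answer.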

It can of course be argued that the solution requires additional rules. But such 
an observation applies to any argument or proof. There is, quite literally, no end 
to the degree of refinement and ‘further detail’ that can be asked for. Remember, 
in their mammoth work Principia Mathematica, it took Whitehead and Russell 362 
pages of strict, formal logical development before they were able to prove 1 + 
1 = 2. And it is still possible to ask for more detail than they supplied. 

One of the features of situation theory that makes it particularly suited to 
modeling mathematical reasoning is that its formalism can represent both the mass 
of general background rules and know-how that are required in order to carry out 
any argument (or any other cognitive task) and the particular rules and facts that 
are directly pertinent to the argument. The general background is captured by a 
situation. The pertinent rules and facts in that background are captured by infons 
supported by the background situation. Just which rules are regarded as pertinent 
is a matter for the mathematician to decide—what passes for an adequate level of
detail for the experienced, professional mathematician differs from what is re- 
quired by a novice. Situation theory does not resolve the issue of how fine the 
analysis should be. (To some extent, classical logic does determine the level of 
analysis. It is because that level is far finer than most mathematicians are able to 
tolerate—apart perhaps from Whitehead and Russell—that mathematicians never 
write up their proofs in predicate logic.) Moreover, situation theory—indeed any 
other mathematical theory—does not have anything to say about how people come 
to recognize in a diagram or figure the crucial information needed to solve a 
particular problem; that is an issue for the cognitive psychologist to investigate. 
What the situation-theoretic analysis does tell you in the case of the blocks 
example is that the solution depends on a mixture of visual reasoning and abstract, 
logical deduction. The visual reasoning involves the situation s, and yields the 
pertinent information provided by the proposition 1. The abstract reasoning 
involved is based most significantly on the constraint in proposition 2. 

So far, our analysis of the solution to the blocks problem has simply identified 
and isolated the crucial information—though it should be remembered that we are 
regarding mathematical arguments as consisting primarily of the acquisition of 
appropriate information, so the identification of the relevant information obtained 
at each stage of the argument is the most significant feature of our analysis. 
Suppose we wanted to take the analysis a stage further, and see how proposition 1 
arises. 

In my own case, I solve the blocks problem as follows. I first identify the target 
block L. I then restrict my visual attention to the subsituation s’ of s that 
comprises all of the blocks that seem to be ‘above’ block L. I do not decide exactly 
which blocks are in the subsituation s’ and which are not. At this stage, that level 
of precision is irrelevant to the solution. The vague, unspecified description of s’ I 
just gave is all that I require. Notice that the same is true of the entire stack 
situation s. The solution does not require that you identify every single block. For 
instance, do you know, without looking, where block Q is located? Is there a block 
labeled Z? Both of these items of information would be part of a complete, 
extensional description of the stack, but they are not required in order to solve the 
problem given. The important feature of s’ as far as the solution is concerned is 
that: 


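3. $s' \models \langle\langle \mathrm{On}, A, B, 1 \rangle\rangle \land \langle\langle \mathrm{On}, A, J, 1 \rangle\rangle \land \langle\langle \mathrm{On}, B, F, 1 \rangle\rangle \land \langle\langle \mathrm{On}, J, F, 1 \rangle\rangle \land \langle\langle \mathrm{On}, F, L, 1 \rangle\rangle$


This is proposition 1 with s’ in place of s. Proposition 3 implies proposition 1 by
virtue of a fundamental fact about infons known as persistence. This property says
that if u is a situation and $\sigma$ is an infon, and if $u \models \sigma$, then for any situation u’
that extends u, $u' \models \sigma$. The property of persistence has deep consequences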
concerning the structure of relations and the requirements that have to be satisfied 
by infons. See [6] for details. 
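In a toy model where a situation is just the finite set of basic infons it supports, and ‘extends’ means set inclusion, persistence can be checked directly. The minimal Python sketch below makes exactly those assumptions; the names on, supports, extends and persists are hypothetical, and the larger situation S lists only a handful of the facts about the full stack.

```python
def on(upper, lower):
    """Build the basic infon <<On, upper, lower, 1>> as a plain tuple."""
    return ("On", upper, lower, 1)


# The substack s' of proposition 3.
S_PRIME = frozenset({on("A", "B"), on("A", "J"), on("B", "F"), on("J", "F"), on("F", "L")})

# A larger situation extending s' (only a few extra facts are listed here).
S = S_PRIME | frozenset({on("B", "C"), on("C", "G"), on("L", "R"), on("R", "K"), on("F", "P")})


def supports(situation, infon):
    return infon in situation


def extends(larger, smaller):
    """Toy assumption: `larger` extends `smaller` when it contains all its infons."""
    return smaller <= larger


def persists(smaller, larger):
    """Every infon supported by `smaller` is also supported by `larger`."""
    return all(supports(larger, infon) for infon in smaller)


if __name__ == "__main__":
    print(extends(S, S_PRIME), persists(S_PRIME, S), len(S_PRIME), len(S))
```

In this model persistence is immediate from set inclusion; the point of the sketch is only that whatever is established about the small situation s’ carries over to s without further work.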

Notice that the restriction of attention from the entire stack s to the substack s’ 
is something the human visual-cognitive system does directly and with ease. Such 
aspects of visual reasoning cannot be adequately captured by traditional logical 
formalisms because they are extensionally imprecise. Their enormous efficiency 
comes in large part from the fact that they do not attempt to resolve issues that are 
ultimately irrelevant to the solution. 

By augmenting proposition 1 with proposition 3, we obtain a more fine-grained 
analysis of the solution to the problem. Notice that the constraint in proposition 2 
can be applied either to situation s or to situation s’. In my solution to the
problem, I never try to apply the constraint to the large situation s. I do however 
apply it to the much smaller situation s’, where there are far fewer blocks to 
consider. 

At this stage, in my case, I am able to solve the problem by inspection. I 
certainly do not resort to a ‘search’. Though I never bother to calculate its exact 
size, situation s’ certainly contains no more than seven blocks, and it is a well 
established fact of cognitive psychology (first observed by the psychologist George 
Miller in 1956) that the human visual-cognitive system is able to process structural 
and quantitative information about any collection of seven or fewer objects. In my 
case, the key step in the solution to the problem is the restriction of attention to 
the substack s’, which is small enough that I can avoid having to resort to a search 
procedure. Situation theory is able to model this key step in a way that is not 
possible with classical logic or other systems that require complete extensional 
precision. 

Of course, it would be possible to continue to refine the analysis. But from the 
point of view of illustration, it should now be reasonably clear how situation theory 
is used in such an analysis. 


7. THE STRUCTURE OF COMPUTER-AIDED PROOFS. The example consid- 
ered in the previous section did not involve any use of a computer, but it illustrated 
the key feature of computer-aided reasoning: the use of more than one means to 
represent and manipulate information. When a mathematician makes use of a 
modern computer system for doing mathematics, such as Mathematica or Maple or 
perhaps some more specialized package, not only is the process very clearly one of 
acquiring and processing information (rather than establishing truths), but the 
reasoning can be very different from mathematical reasoning performed without 
such aids. 

Of course, the use of the computer could be fairly superficial, say just to draw 
an initial diagram to illustrate a traditional, pencil-and-paper type proof, or to 
perform some arithmetical calculations. Though mathematical arguments carried 
out in this fashion may still be analyzed using the techniques outlined in this 
article, I am more interested in reasoning processes in which the computer is used
in a significant way, as an integral part of the reasoning process. 

For example, a fast, interactive graphics facility can enable the mathematician 
to draw not just one but a whole series of graphs or diagrams, to enlarge part of a 
graph and zoom in on regions of particular interest, to rotate or reflect a surface, 
to set up a short ‘movie’ that illustrates the effect of a particular parameter change, 
and so forth. These are usually done in conjunction with more traditional forms of 
reasoning. A mathematician skilled in the use of such systems is able to move from 
one medium to another in a smooth, effortless fashion, going from pencil and 
paper to computer screen and back to pencil and paper again. These changes in 
representational/reasoning medium do not change the problem; there remains a 
single problem, and the entire process results in an accumulation of a single body 
of information about that problem. What changes during the course of a proof is 
the medium in which the information is represented and processed; first one, 
then another, then another, sometimes a mixture of two or more representations at 
the same time. Except for the dynamic aspect of computer representations, this is 
precisely the feature of reasoning with a diagram that was investigated in the 
previous section. Situation theory is well-equipped to handle dynamic issues, since 
time can be one of the arguments in an infon. 

Classical logic and natural deduction have no mechanism to handle multi-representational 
reasoning, since at most they can capture what is represented, not how 
it is represented. And, of course, one enormously significant feature of any kind of 
diagrammatic reasoning process is that the diagram contains far more information 
than is logically required for the argument, which is something else that classical 
logic cannot reflect but situation theory can handle with ease. 

When a mathematician starts to solve a problem, the data given establish both a 
problem domain—perhaps a certain kind of mathematical structure such as a 
group or the real number system, or maybe a stack of blocks or an arrangement of 
rods on a table—and some initial information about that problem domain. We 
shall represent the problem domain as a situation, say $d$, and the given data by a 
finite set $\Sigma$ of infons. The starting point thus constitutes a finite set of propositions 
$d \models \Sigma$ (i.e., $d \models \sigma$ for every $\sigma$ in $\Sigma$). In these terms, the goal may be to 
demonstrate that $d$ satisfies some particular property $\tau$, which amounts to establishing 
the proposition $d \models \tau$, or the goal may be to show that there is some further 
situation $e$ having a certain property, $e \models \tau$. We then model the reasoning process 
as the establishment of a series of sets of propositions $d_1 \models \Sigma_1$, $d_2 \models \Sigma_2$, ..., 
$d_n \models \Sigma_n$, where $d_1 = d$, $\Sigma_1 = \Sigma$, and the final proposition is the goal (so $\Sigma_n = \{\sigma_n\}$ 
for a single compound infon $\sigma_n$). The situation under consideration at each stage 
in this process is called a case. If the compound infons in the set $\Sigma_i$ (at stage $i$ of 
the deduction) are contradictory, we say the case $d_i$ is terminal. Any nonterminal 
case that is subsumed (see presently) by one or more later cases is said to be 
closed; any other nonterminal case is said to be open. 

In carrying out the argument, any stage $d_i \models \Sigma_i$ can be advanced in one of two 
ways: either by acquiring additional information about the situation $d_i$ (i.e., adding 
further infons to the set $\Sigma_i$) or else by turning attention to one or more different 
situations $d_j$, $d_k$, etc., perhaps by breaking $d_i$ into subcases. 

Another way to advance the proof is to take a number of stages $d_i \models \Sigma_i$, $d_j \models \Sigma_j$, 
$d_k \models \Sigma_k$, ... and subsume them by a single new stage $d_m \models \Sigma_m$ that manages to 
combine all the information in the stages subsumed. This is what is done when we 
have exhausted all cases in a ‘proof by cases’ argument. 

Notice that the linear order of the cases and propositions in an argument 
primarily reflects the fact that the reasoning takes place in time. The logical 
structure need not be linear; most commonly it takes the form of a cycle-free 
directed graph. 
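
The bookkeeping just described (stages, and the terminal, closed, and open cases) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Case`, the encoding of an infon as a `(fact, polarity)` pair, and the toy facts are hypothetical choices made here, not part of situation theory or of Hyperproof.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """One stage d_i |= Sigma_i: a situation name plus a set of infons."""
    situation: str
    # Assumption: an infon is encoded as a (fact, polarity) pair.
    infons: frozenset
    # Names of later cases that subsume this one (empty if none do).
    subsumed_by: list = field(default_factory=list)

    def is_terminal(self):
        # Contradictory: some fact occurs with both polarities.
        return any((fact, not pol) in self.infons for fact, pol in self.infons)

    def status(self):
        if self.is_terminal():
            return "terminal"
        return "closed" if self.subsumed_by else "open"


# A toy proof by cases: d is split into the subcases d1 and d2.
d = Case("d", frozenset({("p or q", True), ("p", False)}), subsumed_by=["d1", "d2"])
d1 = Case("d1", d.infons | {("p", True)})   # contradicts ("p", False)
d2 = Case("d2", d.infons | {("q", True)})
statuses = {c.situation: c.status() for c in (d, d1, d2)}
```

Here `d` is closed because its subcases exhaust it, `d1` is terminal because its infons are contradictory, and `d2` remains open.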

Barwise and Etchemendy [1] have suggested that any piece of mathematical 
reasoning involves just five distinct principles, which I list below. They used these 
principles to develop an educational software system called Hyperproof [2], de- 
signed to teach logical reasoning using a combination of ordinary language, 
mathematical formulas, and diagrams. They refer to such reasoning as heteroge- 
neous reasoning. Here are Barwise and Etchemendy’s five principles of heteroge- 
neous reasoning: 


Given. Accept some initial information as given. This step gives rise to the 
initial case. 

Assume. Given some open case d, assume something extra, thereby creating an 
open subcase of d. A typical use of this principle is to assume the antecedent 
of a conditional with the goal of deducing the consequent, thereby establish- 
ing the conditional. 

Subsume. Disregard some open case if it is subsumed by all other open cases. 
For instance, this step is performed when all the information in a case is 
exhausted by its subcases. 

Merge. Take the information common to a number of open cases, and call it a 
new open case. This principle would typically apply after a case has been 
broken into a number of subcases and a desired result established in each 
subcase. 

Recognize as Possible. Given some open case, recognize it as representing a 
genuine possibility. This form of reasoning is typically used when a coun- 
terexample is accepted as showing that some result does not follow from the 
given information. 


Classical logic is largely restricted to a combination of assume and subsume. 

It should be stressed again that the situation-theoretic approach just outlined 
does not attempt to describe how a person comes to make the various decisions 
involved in constructing an argument—which of the five steps listed above to take, 
which of the available representations to use at any stage, and so forth. Situation 
theory does not even claim to provide a model of the actual process of deriving an 
argument—it is hard to imagine any mathematical theory being able to model 
mental processes of that nature. What a situation-theoretic analysis does model is 
the outcome at each stage, the outcome not in terms of truth, rather the informa- 
tional outcome. But notice that steps such as deciding what kind of move to make 
next and choosing the representation are crucial and integral parts of the solution 
process. By taking account of (without modeling) the actual human reasoning 
process, situation theory does, however, add a degree of mathematical precision to 
a description of actual mathematical reasoning. It does not, of course, produce a 
totally formal account on the level of mathematical logic. As such, the application 
of situation theory described in this paper constitutes an instance of what in [7] I 
refer to as ‘soft mathematics’, a use of mathematical ideas and formalisms common 
in the human and social sciences, and, if Casti [4] is right, about to become 
common in applied mathematics as well. 


The Wallet Paradox 


Kent G. Merryfield, Ngo Viet, and Saleem Watson 


1. Introduction. In his book aha! Gotcha [1] Martin Gardner gives an intriguing 
“paradox” involving money and wallets. We found that an analysis of this paradox 
can serve as an interesting way of utilizing some key concepts in probability. The 
paradox as related by Gardner is as follows: 


Each of two persons places his wallet on the table. Whoever has the smallest amount of money 
in his wallet, wins all the money in the other wallet. Each of the players reason as follows: “I 
may lose what I have but I may also win more than I have. So the game is to my advantage.” 


Paradoxically, it seems that the game is to the advantage of both players. Of course 
if one player always carries a larger amount of money than the other player, then 
he always loses. So we must require that the game be “fair” in some sense. In his 
analysis of the problem Kraitchik [2] assumes that the amount of money each 
person carries is uniformly (discretely) distributed between 0 and 100. He then 
makes a chart of the distribution of money of both players and observes that the 
distribution is symmetric (with respect to the diagonal) and concludes that there is 
no advantage. This explanation is considered unsatisfactory by Gardner since it 
does not explain what is wrong with the reasoning of the players. Indeed, 
Kraitchik’s chart gives a particular example where the game is not to the advantage 
of either player, but does not address the source of the paradox. In this article we 
explore the concept of a fair game and in the process we shall resolve the paradox. 


2. What are the random variables? A player says “I may lose what I have but I 
may also win more than I have.” This is a true statement for any single trial of the 
game. However, the inference that “the game is to my advantage” is the source of 
the apparent paradox, because it does not take into account the probabilities of 
winning and losing. In other words, if the game is played many times, how often 
does a player win? How often does he lose? And by how much? Indeed, by 
considering many trials of this game, the enthusiasm of the players for winning 
should be tempered by the observation that when one loses one typically has more 
money in one’s wallet. 

To analyze this game probabilistically we need to know what are the relevant 
random variables and what are their probability distributions [3]. We are interested 
in the probability distributions of $W_A$ and $W_B$, the amounts of money that players A 
and B will win (or lose), respectively. We say that the game is fair if the expected 
value $E(W_A) = 0$ (equivalently $E(W_B) = 0$). To understand $W_A$ and $W_B$, let $X$ and 
$Y$ be the random variables representing the amount of money in the wallets of 
players A and B, respectively. According to the rules of the game, $W_A$ is given by 


\[
W_A(X,Y) = \begin{cases} -X, & \text{if } X > Y \\ Y, & \text{if } X < Y \\ 0, & \text{if } X = Y \end{cases}
\]


and $W_B(X,Y) = -W_A(X,Y)$. These expressions make it difficult to calculate 
$E(W_A)$ and $E(W_B)$, since they depend on both of the distributions of $X$ and $Y$ in a 
nontrivial way. This is apparently why no quick and simple way of resolving the 
paradox is available. 

We now consider models of the game that intuitively seem fair. 


3. A fair game: independent identically distributed X and Y. Perhaps the most 
natural model for this game is one in which the distributions of the money in each 
player’s wallet are the same; that is, $X$ and $Y$ are independent, identically 
distributed random variables on some interval $[a, b]$ or $[a, \infty)$, $0 \le a < b < \infty$. Thus 
the joint distributions $(X,Y)$ and $(Y,X)$ have the same density function $f$ 
satisfying $f(x, y) = f(y, x)$. Now, observing that $W_B(Y, X) = W_A(X,Y)$ and exploiting 
the symmetry in the problem we have 

\[
E(W_B) = \int_a^b \int_a^b W_B(x, y) f(x, y)\, dy\, dx
= \int_a^b \int_a^b W_B(y, x) f(y, x)\, dy\, dx
= \int_a^b \int_a^b W_A(x, y) f(x, y)\, dy\, dx = E(W_A),
\]

where we have made the change of variables $(x, y) \to (y, x)$, whose Jacobian is 1. 
This combined with the observation that $W_B(X,Y) = -W_A(X,Y)$ shows that 
$E(W_A) = 0$; and so, by our definition, this is a fair game. 
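
The symmetry argument can be checked exactly on a discrete analogue. The sketch below assumes Kraitchik's setting from the Introduction, with both wallets independently and uniformly distributed on the whole amounts 0 to 100; the function names are illustrative only, and the section itself works with densities rather than with this discrete model.

```python
from fractions import Fraction


def winnings_a(x, y):
    """Player A's winnings W_A(x, y) under the rules of the wallet game."""
    if x > y:
        return -x   # A holds more, so A loses his own wallet
    if x < y:
        return y    # A holds less, so A wins B's wallet
    return 0


def expected_winnings_a(values):
    """E(W_A) when X and Y are independent and uniform on `values`."""
    p = Fraction(1, len(values) ** 2)
    return sum(p * winnings_a(x, y) for x in values for y in values)


fair = expected_winnings_a(range(0, 101))
```

Because the arithmetic is exact, `fair` is exactly zero rather than merely close to it, mirroring the cancellation in the integral above.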

As a concrete example, suppose X and Y are jointly uniformly distributed on 
the unit square $[0, 1] \times [0,1]$. Given that player A carries $x$ dollars, he wins 
(collecting $y$ dollars from B) with probability $1 - x$. 
In that case $y \in (x, 1]$ with mean equal to $(1 + x)/2$. Player A loses $x$ dollars with 
probability $x$. Given that player A carries $x$ dollars in his wallet, the conditional 
expectation of the amount of money that he will win is 

\[
E(W_A \mid X = x) = \frac{1+x}{2}(1 - x) - x \cdot x = \frac{1}{2} - \frac{3}{2}x^2.
\]

Thus the expected value for $W_A$ is 

\[
E(W_A) = \int_0^1 E(W_A \mid X = x)\, dx = \int_0^1 \left[\frac{1}{2} - \frac{3}{2}x^2\right] dx = 0.
\]

It is interesting to consider special cases of this formula for the conditional 
expectation. Since $E(W_A \mid X = 1) = -1$ and $E(W_A \mid X = 0) = 1/2$ we see that a 
player carrying one dollar in his wallet should expect to lose it, whereas a player 
carrying nothing in his wallet should expect to gain half a dollar (the mean). 
Interestingly, if a player is carrying half a dollar (the mean) in his wallet, then 
$E(W_A \mid X = 1/2) = 1/8$; that is, his expectation of winning is positive. 
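
The three special cases and the vanishing integral can be reproduced with exact rational arithmetic. The sketch below is only a check of the formulas in this section; the name `cond_exp` is an illustrative choice, and Simpson's rule is used because it integrates polynomials of degree at most three exactly.

```python
from fractions import Fraction


def cond_exp(x):
    """E(W_A | X = x) for X, Y independent and uniform on [0, 1]."""
    x = Fraction(x)
    win = (1 - x) * (1 + x) / 2   # P(Y > x) times the mean of Y on (x, 1]
    lose = x * x                  # P(Y < x) times the amount x that is lost
    return win - lose


special = [cond_exp(1), cond_exp(0), cond_exp(Fraction(1, 2))]

# Simpson's rule on [0, 1]; exact here because cond_exp is a quadratic.
total = (cond_exp(0) + 4 * cond_exp(Fraction(1, 2)) + cond_exp(1)) / 6
```

The list `special` holds the values at $x = 1$, $x = 0$, and $x = 1/2$, and `total` is the overall expectation $E(W_A)$.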


4. “The game is not to my advantage”. It may be tempting to think that the game 
would be fair if we require only that the distributions X and Y have the same 
mean. But this is not always the case, as we now show. Suppose that X and Y have 
the joint distribution shown in the following chart. 

                 Y = 0     Y = 1
    X = 0         2/6       2/6
    X = 3/2       1/6       1/6


For player A, the marginal distribution is $p(0) = 4/6$ and $p(3/2) = 2/6$, and for 
player B, the marginal distribution is $p(0) = 3/6$ and $p(1) = 3/6$. The mean for 
player A is $m_A = (0 \times 4/6) + ((3/2) \times (2/6)) = 1/2$. Similarly the mean for 
player B is $m_B = (0 \times 3/6) + (1 \times 3/6) = 1/2$. But the expected value of player 
A’s winnings is 

\[
E(W_A) = 0 \times \frac{2}{6} + 1 \times \frac{2}{6} - \frac{3}{2} \times \frac{1}{6} - \frac{3}{2} \times \frac{1}{6} = -\frac{1}{6}.
\]

This shows that the game is to the advantage of player B. 

It turns out that even a smaller mean does not guarantee an advantage in this 
game. Indeed, replacing 3/2 in the chart by any number in the interval (1, 3/2) 
yields an example where player A has a smaller mean than that of B. However, 
player A is still at a disadvantage (that is, $E(W_A) < 0$). 
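
Both claims of this section can be verified exactly. In the sketch below the joint probabilities are an assumption: they are inferred from the stated marginals and from the four terms of the expected-value computation. The helper name `wallet_stats` and the test value 5/4 are illustrative choices.

```python
from fractions import Fraction as F


def wallet_stats(c):
    """Return (mean of X, mean of Y, E(W_A)) when A's larger amount is c."""
    # Assumed joint probabilities P(X = x, Y = y), inferred from the text.
    joint = {(0, 0): F(2, 6), (0, 1): F(2, 6), (c, 0): F(1, 6), (c, 1): F(1, 6)}
    mean_a = sum(p * x for (x, y), p in joint.items())
    mean_b = sum(p * y for (x, y), p in joint.items())
    # W_A is -x if x > y, y if x < y, and 0 if x = y.
    exp_a = sum(p * (-x if x > y else y if x < y else 0)
                for (x, y), p in joint.items())
    return mean_a, mean_b, exp_a


chart = wallet_stats(F(3, 2))     # the value 3/2 used in the text
smaller = wallet_stats(F(5, 4))   # one value inside the interval (1, 3/2)
```

With 3/2 both means equal 1/2 while $E(W_A) = -1/6$; with 5/4 player A has the smaller mean and still a negative expectation.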


5. Conclusion. The concept of a fair game has to do with repeated trials (and not 
with any single trial) of a game. So the wallet game is properly understood in the 
context of the probability distributions of the money in the wallets and the 
expected values of winning for each player. We have shown that the game is fair if 
reasonable assumptions are made on these probability distributions (Section 3) 
whereas the game is not fair with other assumptions on these distributions (Section 
4). Moreover, our analysis may be used to determine whether the game is fair for 
any given pair of distributions. So in the context of probability, the paradox is 
resolved. 

Some interesting questions remain unanswered about this problem. For in- 
stance, if we suppose that the distributions of players A and B are required to have 
the same means, is there a strategy that player A could adopt to have a winning 
edge? In other words, is there a preferred distribution (or a winning strategy)? 


The Weierstrass Approximation Theorem 
and Large Deviations 


Henryk Gzyl and José Luis Palacios 


Bernstein’s proof (1912) of the Weierstrass approximation theorem, which states 
that the set of real polynomials over [0, 1] is dense in the space of all continuous 
real functions on [0, 1], is a classic application of probability theory to real analysis 
that finds its way into many textbooks ([1] and [2]) and journals [3]. All that is 
invoked in Bernstein’s proof (at least as presented in [1] and [3]) is Chebyshev’s 
inequality, and if the argument is applied to a function satisfying a Lipschitz 
condition, the rate of convergence of the Bernstein polynomials to the function can 
be shown to be at least of order $1/n^{1/3}$. If instead of Chebyshev’s inequality we 
use another probabilistic tool very much in vogue nowadays, the theory of large 
deviations, we can prove that the rate of convergence is at least of order 
$\ln^{1/2} n / n^{1/2}$. All the material used here concerning large deviations is elementary 
and can be found in [1]. 

Let f be a real function on [0, 1] that satisfies a Lipschitz condition, i.e., there is 
a constant C such that for all x, y € [0, 1] 


\[ |f(x) - f(y)| \le C|x - y|. \]


Then we have the following: 


Theorem 1 (Weierstrass approximation theorem). For $f$ satisfying a Lipschitz condition, 
there is a sequence of polynomials $p_n(x)$, where the degree of $p_n(x)$ is $n$, and a 
constant $K$, which depends on $f$, such that 

\[
\|p_n - f\| \le K \frac{\ln^{1/2} n}{n^{1/2}}.
\]

Here $\|\cdot\|$ denotes the sup norm. Since the function $f$ is Lipschitz, it is uniformly 
continuous and bounded by a constant, say, $M$ so that $\|f\| \le M$. In order to prove 
the theorem we need the following lemma, taken almost verbatim from [1] 
(Corollary A.7) and included for completeness: 


Lemma 1. For a binomial random variable B(n, x) with n independent trials and 
probability of success x for each of them, and a > 0 arbitrary, we have 


\[ P(|B(n, x) - nx| > a) \le 2e^{-2a^2/n}. \]


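Proof: Let $X_1, \ldots, X_n$ be independent and identically distributed random variables 
with 

\[
P(X_i = 1 - x) = x, \qquad P(X_i = -x) = 1 - x,
\]

and let $X = X_1 + \cdots + X_n$. Clearly $X$ has distribution $B(n, x) - nx$. By 
symmetry (replacing $x$ by $1 - x$ and $X$ by $-X$), it is enough to prove the one-sided inequality 

\[
P(X > a) < e^{-2a^2/n}. \qquad (1)
\]

Step 1. For all reals $\alpha$, $\beta$ with $|\alpha| \le 1$, we have 

\[
\cosh(\beta) + \alpha \sinh(\beta) \le e^{\beta^2/2 + \alpha\beta}. \qquad (2)
\]

Proof. This is immediate if $\alpha = 1$ or $\alpha = -1$ or $|\beta| \ge 100$. If (2) were false, the 
function 

\[
f(\alpha, \beta) = \cosh(\beta) + \alpha \sinh(\beta) - e^{\beta^2/2 + \alpha\beta}
\]

would assume a positive global maximum in the interior of the rectangle 

\[
R = \{(\alpha, \beta) : |\alpha| \le 1, |\beta| \le 100\}.
\]

Setting partial derivatives equal to 0, we find 

\[
\sinh(\beta) + \alpha \cosh(\beta) = (\alpha + \beta) e^{\beta^2/2 + \alpha\beta}, \qquad
\sinh(\beta) = \beta e^{\beta^2/2 + \alpha\beta},
\]

and thus $\tanh \beta = \beta$, which implies $\beta = 0$. But $f(\alpha, 0) = 0$ for all $\alpha$, a contradiction. 


Step 2. For all $\theta \in [0,1]$ and all $\lambda$, 

\[
\theta e^{\lambda(1 - \theta)} + (1 - \theta) e^{-\lambda\theta} \le e^{\lambda^2/8}. \qquad (3)
\]

Proof. Setting $\theta = (1 + \alpha)/2$ and $\lambda = 2\beta$, (3) reduces to (2). 


Step 3. Let, for the moment, $\lambda > 0$ be arbitrary and let $E[\cdot]$ denote “expected 
value.” Then 

\[
E[e^{\lambda X_i}] = x e^{\lambda(1 - x)} + (1 - x) e^{-\lambda x} \le e^{\lambda^2/8}
\]

by Step 2. Thus 

\[
E[e^{\lambda X}] = \prod_{i=1}^{n} E[e^{\lambda X_i}] \le e^{\lambda^2 n/8}.
\]

Applying Markov’s inequality, 

\[
P(X > a) = P(e^{\lambda X} > e^{\lambda a}) \le \frac{E[e^{\lambda X}]}{e^{\lambda a}} \le e^{\lambda^2 n/8 - \lambda a}.
\]

We set $\lambda = 4a/n$ to optimize the inequality: $P(X > a) < e^{-2a^2/n}$, as claimed. ∎ 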
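
The lemma is easy to probe numerically. The sketch below compares the exact two-sided binomial tail with the bound $2e^{-2a^2/n}$ for a few arbitrarily chosen values of $n$, $x$, and $a$; the function names and the test values are assumptions made for illustration, and such a check is of course no substitute for the proof.

```python
import math


def two_sided_tail(n, x, a):
    """Exact P(|B(n, x) - n x| > a) for a binomial variable B(n, x)."""
    return sum(math.comb(n, i) * x**i * (1 - x)**(n - i)
               for i in range(n + 1) if abs(i - n * x) > a)


def lemma_bound(n, a):
    """Right-hand side of Lemma 1: 2 exp(-2 a^2 / n)."""
    return 2 * math.exp(-2 * a * a / n)


checks = [(n, x, a, two_sided_tail(n, x, a), lemma_bound(n, a))
          for n in (10, 50, 200) for x in (0.1, 0.5, 0.9) for a in (1, 5, 10)]
```

Each tuple in `checks` records one parameter choice together with the exact tail probability and the bound, so the two can be compared side by side.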


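Proof of the theorem. Define the Bernstein polynomials 

\[
p_n(x) = \sum_{i=0}^{n} \binom{n}{i} x^i (1 - x)^{n-i} f\!\left(\frac{i}{n}\right).
\]

Then, since $\sum_{i=0}^{n} \binom{n}{i} x^i (1 - x)^{n-i} = 1$, and because of the lemma, we have 

\[
\begin{aligned}
|p_n(x) - f(x)| &\le \sum_{|i - nx| \le a} \binom{n}{i} x^i (1 - x)^{n-i} \left| f\!\left(\tfrac{i}{n}\right) - f(x) \right| \\
&\quad + \sum_{|i - nx| > a} \binom{n}{i} x^i (1 - x)^{n-i} \left( \left| f\!\left(\tfrac{i}{n}\right) \right| + |f(x)| \right) \\
&\le \frac{aC}{n} + 2M\, P(|B(n, x) - nx| > a) \\
&\le \frac{aC}{n} + 2M e^{-2a^2/n}. \qquad (4)
\end{aligned}
\]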


Now we optimize the parameter $a$ in terms of $n$. We consider the function 

\[
F(a) = \frac{aC}{n} + 2M e^{-2a^2/n},
\]

set 

\[
0 = F'(a) = \frac{C}{n} - \frac{8Ma}{n} e^{-2a^2/n},
\]

and get the exponential equation 

\[
a e^{-2a^2/n} = \frac{C}{8M}. \qquad (5)
\]

While a precise solution for this exponential equation is unavailable, we are led to 
the asymptotic solution 

\[
a = \frac{1}{2} n^{1/2} \ln^{1/2} n.
\]

Replacing now in (4) $a = \frac{1}{2} n^{1/2} \ln^{1/2} n$, for which $e^{-2a^2/n} = n^{-1/2}$, we obtain 

\[
\|p_n - f\| \le \left( \frac{C}{2} + \frac{2M}{\sqrt{\ln n}} \right) \frac{\ln^{1/2} n}{n^{1/2}},
\]

so $K = C/2 + 2M$ works for $n \ge 3$. ∎ 
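
As a numerical illustration of the theorem, the sketch below evaluates Bernstein polynomials of one particular Lipschitz function and compares the error with the bound $K \ln^{1/2} n / n^{1/2}$. The test function $f(x) = |x - 1/2|$ (so $C = 1$ and $M = 1/2$), the grid size, and the function names are all assumptions chosen for illustration; the maximum over a finite grid only approximates the sup norm.

```python
import math


def bernstein(f, n, x):
    """Value at x of the degree-n Bernstein polynomial of f."""
    return sum(math.comb(n, i) * x**i * (1 - x)**(n - i) * f(i / n)
               for i in range(n + 1))


def sup_error(f, n, grid=201):
    """Approximate sup norm of p_n - f on an evenly spaced grid in [0, 1]."""
    pts = [k / (grid - 1) for k in range(grid)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in pts)


f = lambda x: abs(x - 0.5)   # Lipschitz with C = 1, bounded by M = 1/2
K = 1 / 2 + 2 * 0.5          # K = C/2 + 2M, as at the end of the proof
rows = [(n, sup_error(f, n), K * math.sqrt(math.log(n) / n))
        for n in (3, 10, 50, 200)]
```

Each row holds $n$, the observed grid error, and the bound, so the comparison for $n \ge 3$ can be read off directly.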


The classic probabilistic proof of the Weierstrass approximation theorem, when 
applied to a Lipschitz function, yields instead of equation (4) the expression 
\[
G(a) = \frac{aC}{n} + \frac{Mn}{2a^2}, \qquad (6)
\]
where the first summand follows from the Lipschitz condition and the second is 
due to Chebyshev’s inequality. Optimizing $G(a)$ (a much easier task than optimizing 
$F(a)$ above) yields $a \sim n^{2/3}$ and therefore, inserting $a = n^{2/3}$ into (6) (this is 
the choice of $a$ in [1], by the way) yields 

\[
\|p_n - f\| \le G(n^{2/3}) = O\!\left(\frac{1}{n^{1/3}}\right),
\]


a weaker result, which justifies the effort of using the large deviation inequality. 
How does our rate of convergence compare with the ones found in standard 
textbooks on approximation theory? In [4], for instance, it is mentioned that if 
$f(x) = x^2$ and $p_n(x)$ is the Bernstein polynomial for this function, then $\|p_n - f\| = 1/(4n)$. 
In general, this rate cannot be expected for all functions, and it is an 
exercise in [4] to prove that if $f$ is twice continuously differentiable, then the error 
of the approximation satisfies the bound 

\[
\|p_n - f\| \le \frac{1}{8n} \|f''\|.
\]


This rate is better than ours, but the assumption that f be twice continuously 
differentiable is much more restrictive than our Lipschitz condition. 
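
The value $1/(4n)$ quoted for $f(x) = x^2$ can be confirmed exactly. The Bernstein polynomial of $x^2$ is $x^2 + x(1 - x)/n$, so the error is largest at $x = 1/2$; the sketch below evaluates it there with rational arithmetic. The function name and the sample degrees are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def bernstein_sq(n, x):
    """Exact value at x of the degree-n Bernstein polynomial of f(x) = x^2."""
    return sum(comb(n, i) * x**i * (1 - x)**(n - i) * Fraction(i, n) ** 2
               for i in range(n + 1))


half = Fraction(1, 2)
errors = {n: bernstein_sq(n, half) - half**2 for n in (1, 5, 20)}
```

The dictionary `errors` maps each degree $n$ to the exact error at $x = 1/2$.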


From the MONTHLY, Volume 4, 1897: 


A Brief Introduction to the Infinitesimal Calculus. Designed Especially 
to Aid in Reading Mathematical Economics and Statistics. By 
Irving Fisher, Ph.D., Assistant Professor of Political Science in Yale 
University, Co-author of Phillips and Fisher’s Elements of Geometry. 
12mo. Cloth. 84 pages. Price, 75 cents. New York and London: The 
Macmillan Co. 

This little work on the Calculus will be received with joy by a great army of 
students, teachers, and professors, who have lacked the time and courage to attack 
some of the more exhaustive works on the subject yet felt the need of a knowledge 
of the Calculus in order to enable them to read with intelligence the highest 
authorities on economic as well as other subjects. Dr. Fisher has prepared this 
little work with a special view of the needs of this class of students. Anyone with a 
clear mind can very easily read and understand every sentence in this book. There 
is no metaphysical speculation nor obscure statements made in establishing its first 
principles. pp. 261-262 

65. Proposed by GEORGE LILLEY, Ph.D., LL.D., Portland, 
Oregon. 


A string is wound spirally 100 times around a cone 100 feet high and 2 feet in 
diameter at the base. Through what distance will a duck swim in unwinding the 
string keeping it taut at all times, the cone standing on its base and at right angles 
to the surface of the water? p. 62 


66. Proposed by J. K. ELLWOOD, A.M., Principal of Colfax 
School, Pittsburgh, Pennsylvania. 


Around the top of a conical frustum (base 5 feet, top 1 foot, altitude 100 feet) is 
wound a rope 100 feet long and 1 inch thick. It is unwound by a hawk flying in one 
plane. How far does Mr. Hawk fly? p. 62 


THE EVOLUTION OF... 


Edited by Abe Shenitzer 
Mathematics, York University, North York, Ontario M3J 1P3, Canada 


On the Historical Development 
of Infinitesimal Mathematics 
translated by Abe Shenitzer with the editorial assistance of Hardy Grant 


Detlef Laugwitz 


PART II. THE CONCEPTUAL THINKING OF CAUCHY¹ 


5. CAUCHY 1821: CONTINUITY AND CONVERGENCE. Next to Fourier and 
Simeon Denis Poisson (1781-1840), Augustin-Louis Cauchy (1789-1857) was one 
of the most prolific workers in the area of the new mathematical physics. Begin- 
ning in 1816, he taught at the Ecole Polytechnique. From its establishment in 1795, 
this school provided a demanding basic course of study in mathematics that was to 
become a model for university education. There were no usable textbooks (apart 
from the one by Lacroix), and Cauchy had to write outlines for his lectures. Far 
from being learning aids for the students, they were outlines of the foundations of 
analysis. Like so many textbook writers after him, Cauchy thought more about 
scientific research than about the teaching of beginners. But this fact resulted in a 
fundamental reshaping of analysis. Cauchy initiated the elimination of algorithmic 
thinking by replacing it with conceptual thinking. 

Functions were no longer given by expressions. In his Cours d’analyse of 1821, 
Chapter I on Real Functions opens with the following sentence: “If variable 
magnitudes are so interrelated that prescribing the value of one of them makes it 
possible to infer the values of all the others, then we ordinarily think of these 
magnitudes as expressed in terms of one of them, which then bears the name of an 
independent variable; and the magnitudes expressed in terms of the independent 
variable are what one calls functions of this variable.” And Chapter II begins with 
the sentence: “One says that a variable magnitude becomes infinitely small if its 
absolute value decreases indefinitely, so that it converges to the limit zero.” 

These descriptions cannot be regarded as definitions from which one can 
deduce theorems. Rather, these are clarifications of usage. It is hard to tell 
whether or not Cauchy distances himself from the conception of a function as an 
“analytical expression.” What counts is the way one operates, and this can be seen 
in the context of continuity of functions. Inferences are drawn from the conceptual 
property and not from the representation by means of a particular expression. 


¹In a recent MONTHLY article [103 (1996) 846-853] Roman Kossak tries to explain “what are 
infinitesimals and why they cannot be seen.” I hope readers of my paper realize that Cauchy knew very 
well what infinitesimals are and that they can be seen. 

Continuity of a function $f(x)$, given on an interval, is defined in para. 2 of 
Chapter II: Let $\alpha$ be an infinitely small increment of $x$. Then $f(x)$ is said to be 
continuous in this interval “if the absolute value of $f(x + \alpha) - f(x)$ decreases 
indefinitely together with that of $\alpha$. In other words, the function $f(x)$ is continuous 
with respect to $x$ within the given bounds if, within these bounds, an infinitely 
small increment of the variable always results in an infinitely small increase of the 
function itself.” Cauchy italicizes this alternative definition.² 

One gets the impression that “infinitely small” is just an abbreviation for 
“tending to the limit 0.” But one must check how Cauchy works with this notion. 
And so we look at his Theorem I on continuous functions: If $f(x, y)$ is continuous 
with respect to each of the variables $x$ and $y$ in the vicinity of $(x_0, y_0)$, then $f(x, y)$ 
has the limit $f(x_0, y_0)$ if $x$ and $y$ converge to $x_0$ and $y_0$ respectively. 

The statement says nothing about infinitely small magnitudes, and there are 
counterexamples such as $f(x, y) = xy/(x^2 + y^2)$ for $(x, y) \ne (0,0)$, and $f(0,0) = 0$. 
Here $f$ is continuous with respect to $y$ for every fixed real $x$ and continuous with 
respect to $x$ for every fixed $y$. But if we put $x = y = t$ and let $t$ tend to 0, then we 
obtain $1/2 \ne 0$. How does Cauchy prove this obviously false theorem? Let $\alpha$ and 
$\beta$ denote infinitely small magnitudes. Then $f(x + \alpha, y + \beta) - f(x, y) = [f(x + \alpha, y + \beta) - f(x + \alpha, y)] + [f(x + \alpha, y) - f(x, y)]$ is infinitely small. This is so 
because the expression in the first bracket is an infinitely small magnitude due to 
the continuity with respect to the second variable, and the expression in the second 
bracket is such because of the continuity with respect to the first variable. 

What happens to our counterexample and the first of the two brackets for 
$x = y = 0$? The counterexample reduces to $\alpha\beta/(\alpha^2 + \beta^2)$ and this is not always 
infinitely small when $\alpha$ and $\beta$ are; a relevant case is $\alpha = \beta \ne 0$. On other 
occasions, Cauchy readily gives counterexamples, and it is safe to assume that he 
had seen an example like the one just given. If that is so, then the interpretation to 
be given to his assumption is that $f(x, y)$ is continuous with respect to $y$ for every 
fixed $x = x_0 + \alpha_0$, with $x_0$ real and $\alpha_0$ a fixed (!) infinitely small magnitude. But 
this seems to contradict the initial description of infinitely small magnitudes as 
variables that converge to 0! 
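
The behaviour of the counterexample is easy to tabulate. The sketch below evaluates $f(x, y) = xy/(x^2 + y^2)$ exactly along the two axes and along the diagonal $x = y = t$ for a few small values of $t$; the particular step sizes are an arbitrary illustrative choice.

```python
from fractions import Fraction


def f(x, y):
    """f(x, y) = xy / (x^2 + y^2) away from the origin, and f(0, 0) = 0."""
    if x == 0 and y == 0:
        return Fraction(0)
    return Fraction(x) * Fraction(y) / (Fraction(x) ** 2 + Fraction(y) ** 2)


steps = [Fraction(1, 10**k) for k in range(1, 6)]
along_x_axis = [f(t, 0) for t in steps]     # y fixed at 0
along_y_axis = [f(0, t) for t in steps]     # x fixed at 0
along_diagonal = [f(t, t) for t in steps]   # x = y = t
```

Along either axis every value is 0, in agreement with $f(0,0) = 0$, while along the diagonal every value is exactly 1/2, however small $t$ becomes.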

Cauchy’s Theorem I on convergent series of continuous functions places us in a 
similar dilemma. Let $s_n(x)$ denote the necessarily continuous partial sums of a 
series of continuous functions, and suppose that this sequence converges for all 
$x$ in an interval to a limit function $s(x)$. Then 

\[
s(x + \alpha) - s(x) = [s(x + \alpha) - s_n(x + \alpha)] + [s_n(x + \alpha) - s_n(x)] + [s_n(x) - s(x)].
\]

If $\alpha$ is an infinitely small magnitude, then, because of the continuity of $s_n$, the 
expression in the second bracket on the right is always infinitely small. Because of 
the convergence at $x$, the absolute value of the expression in the last bracket is 
certainly less than a given real $\varepsilon > 0$ for $n > N(\varepsilon)$. If this is also true for the 
expression in the first bracket, then we obtain $|s(x + \alpha) - s(x)| < 3\varepsilon$ for arbitrary 


²A reader influenced by today’s school mathematics may be shocked by the expression “the function 
$f(x)$,” which we use in this paper in connection with Cauchy. This expression is entirely unobjectionable 
if, as has usually been the case since Cauchy, we use the last letters $x, y, z, u, v, \ldots$ of the alphabet as 
symbols for variables, and indexed letters $x_0, y_0, \ldots$, or the first letters $a, b, c, \ldots$ of the alphabet, as 
symbols for fixed values. The domains of definition, in our case invariably intervals, are inferred from 
the context. The fear of interchanging function and functional value is a problem for beginners and not 
for advanced and cautious users. 

positive real $\varepsilon$. Hence $s(x + \alpha) - s(x)$ is an infinitely small magnitude. But then 
s(x) is a continuous function. Cauchy states this as a theorem: “If the different 
terms of a series are functions of a variable x which are continuous with respect to 
this variable in the vicinity of a particular value for which the series is convergent, 
then the sum of the series is likewise a continuous function of x in the vicinity of 
this particular value.” 

Here is the original text: “Lorsque les différens termes de la série sont des 
fonctions d’une méme variable x, continues par rapport a cette variable dans le 
voisinage d’une valeur particuliére pour laquelle la série est convergente, la somme 
s de la série est aussi, dans le voisinage de cette valeur particuliére, fonction 
continue de x.” 

We see that the assumed continuity in the vicinity of $x = x_0$ is used to prove 
that $s_n(x_0 + \alpha_0)$ converges to $s(x_0 + \alpha_0)$ for a real fixed $x_0$ and an infinitely small 
fixed $\alpha_0$. 

It is certain that Cauchy knew the relevant counterexamples—Fourier series of 
discontinuous functions—very well, for he had used them in his research for years. 
More simply, put $s_n(x) = nx^2/(1 + nx^2)$. This sequence of continuous functions 
converges to 1 for all real $x \ne 0$ and to 0 for $x = 0$. What happens in the vicinity of 
$x = 0$? For $x = \alpha$ we consider different “variables with limit 0” and test $s_n(\alpha)$ for 
convergence. For this we must use Cauchy’s concept of convergence. 
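
A finite computation already shows why the vicinity of $x = 0$ is delicate. In the sketch below a large integer $m$ stands in for an infinitely large number (an assumption made purely for illustration), and $s_n(1/m)$ is evaluated exactly for two different growth rates of $n$ relative to $m$.

```python
from fractions import Fraction


def s(n, x):
    """s_n(x) = n x^2 / (1 + n x^2)."""
    x = Fraction(x)
    return n * x * x / (1 + n * x * x)


# Pointwise behaviour for a fixed real x as n grows.
at_zero = [s(n, 0) for n in (1, 10, 1000)]
at_tenth = [s(n, Fraction(1, 10)) for n in (10**2, 10**4, 10**6)]

# Along x = 1/m the value depends on how n grows relative to m.
m = 10**6
same_rate = s(m, Fraction(1, m))         # n = m
faster_rate = s(m * m, Fraction(1, m))   # n = m^2
```

For fixed $x = 1/10$ the values climb toward 1, and at $x = 0$ they stay at 0. At $x = 1/m$, however, the choice $n = m$ gives $1/(m + 1)$ while $n = m^2$ gives exactly $1/2$, so the two values do not approach one another.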

Convergence of the $s_n$ to a fixed limit $s$ translates into “it is necessary and 
sufficient that, for infinitely large values of $n$, the sums $s_n, s_{n+1}, s_{n+2}$, etc. should 
differ from the limit $s$, and thus from one another, by infinitely small magnitudes.” 
Hence if $n$ and $n'$ are infinitely large numbers, then $s_n - s_{n'}$ must be an infinitely 
small magnitude. As Cauchy explained early on, an infinitely large number is a 
variable magnitude with limit $\infty$. 

Let’s choose for our example a fixed, infinitely small magnitude $\alpha = m^{-1}$, $m$ an 
infinitely large number. If we put $n = m^2$ and $n' = m$, then 

\[
s_n(\alpha) - s_{n'}(\alpha) = \frac{1}{2} - \frac{1}{m + 1},
\]

and this is not infinitely small. Since the assumption of convergence in the vicinity 
of $x = 0$ is not fulfilled, we have no counterexample. 

It is not surprising that Cauchy’s theorems were thought to be false. First Abel, 
who greatly admired Cauchy’s rigor in analysis, expressed scepticism. He did this in 
a footnote to his paper on the binomial series published in 1826, in which he also 
hinted at Fourier series. Since the appearance of Robinson’s book (1966), mathematicians 
have again tried to take seriously Cauchy’s pronouncements about 
infinitely small magnitudes and about infinitely large values of the number $n$. By 
now there are scores of discussion papers. A comprehensive account was published 
by Bottazzini in 1992. 

If one wants to understand these objectively existing obscurities, then it is not 
enough to look at just these two theorems. Rather, one must look at the overall 
development of Cauchy’s mathematical views. It is hardly plausible that this great 
and versatile mathematician should have failed in the very first two theorems of his 
1821 textbook. This becomes even more unlikely if we bear in mind that, as we 
shall see below, the second theorem, the one dealing with the continuity of the 
sum of a series, played a fundamental role in the further development. 

In his many papers on mathematical physics and on Fourier analysis that date 
back to the years 1815-1820, Cauchy refrained from references to infinitely small 
magnitudes and spoke only of the “very small.” In the 1820s, frequently in the 
same context as before, he spoke about infinitely small numbers, and treated them 
as independent mathematical magnitudes, without recourse to the concept of a 
variable. In the first few years after 1816, while teaching at the École 
Polytechnique, Cauchy avoided the use of the infinitely small. This provoked growing 
criticism on the part of his colleagues, including the physicist Petit, who 
emphasized the didactical and practical advantages of the use of infinitely small 
magnitudes. In 1819 and in 1820, the Conseil d’Instruction at the École exerted strong 
pressure on Cauchy, but this alone would not have made this rather stubborn man 
change his mind. Around 1820, he must have realized that infinitesimal considera- 
tions were a powerful research method at a time when he was in a state of constant 
rivalry, especially with Poisson. Nevertheless, he insisted on using the derivative 
function instead of the quotient of infinitely small differentials, and assigned to 
infinitely small magnitudes a role linked to the concept of continuity. 

In this connection, Cauchy said in the introductory remarks to his Cours: “En 
parlant de la continuité des fonctions, je n’ai pu me dispenser de faire connaitre les 
propriétés principales des quantités infiniment petites, propriétés qui servent de 
base au calcul infinitésimal.” Beginning in 1823, in introductions to books, he 
claimed consistently to have harmonized the intuitive appeal of infinitely small 
magnitudes with the rigor of analysis. 

In the 1820s Cauchy was a very busy man. He taught not only at the École 
Polytechnique but also at other institutions, and he published thousands of pages 
each year. The care with which these works were composed is admirable. Having 
recognized the usefulness of infinitely small magnitudes in research, he must have 
been tempted not only to use them in concurrently written textbooks but also to 
justify them rigorously. This was a stepwise process. At the beginning we still 
encounter the traditional locutions about variables with limit zero or infinity. In 
time, the infinitely small magnitudes, soon referred to as numbers, acquire inde- 
pendence and are handled like ‘genuine’ numbers. The year 1823 witnessed a 
breakthrough in the form of an abstract theory, to be discussed in Section 7. 

We can look at the temporal sequence of Cauchy’s books and concurrent 
research papers from the viewpoint of the genesis of a new theory. In this context, 
from which one must not dislodge individual theorems and subject them to critical 
inspection in isolation, a substantial portion of the historical evolution of mathe- 
matics becomes understandable. 


6. CAUCHY AND THE BINOMIAL SERIES. In Chapter VI of his Cours d’analyse 
Cauchy treated the problem of the expansion, “to every possible extent,” of the 
function $(1 + x)^\mu$. After Newton, the binomial series had found many uses, and 
Euler and others had tried in vain to find a proof of its validity for arbitrary real $\mu$. 
Bolzano’s correct proof of 1816 was overlooked. Nowadays, we make use of 
Taylor’s formula with remainder. 

For x with |x| < 1 Cauchy defines 

$$\Phi(\mu) = 1 + \frac{\mu}{1}\,x + \frac{\mu(\mu-1)}{1\cdot 2}\,x^2 + \cdots$$

as a function of the real variable $\mu$. He had proved earlier that this series 
converges and that the function satisfies the functional equation 
$\Phi(\mu)\cdot\Phi(\mu') = \Phi(\mu + \mu')$. The partial sums of the series are polynomials in $\mu$, and thus 
continuous functions. By Theorem I on series, $\Phi(\mu)$ is also a continuous function 
of $\mu$. 

As proved earlier, $(1 + x)^\mu$ is continuous because it is an exponential function. 
Thus we have two continuous (nonzero) solutions of the functional equation. But 
Cauchy proved that there is only one continuous solution of this functional 
equation with $\Phi(0) = 1$ once its value at $\mu = 1$ is fixed, and both functions take the 
value $1 + x$ there. Hence $(1 + x)^\mu$ is equal to the series. 
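The two facts used in this proof, the functional equation and the agreement of the series with $(1 + x)^\mu$, can be spot-checked numerically. The following is only a sketch under stated assumptions: the function names and the truncation at 400 terms are illustrative choices, not anything taken from the Cours.

```python
import math

def binomial_series(mu: float, x: float, terms: int = 400) -> float:
    """Partial sum of 1 + mu*x + mu(mu-1)/(1*2)*x^2 + ... for |x| < 1."""
    if not abs(x) < 1:
        raise ValueError("this sketch only treats |x| < 1")
    total, term = 1.0, 1.0
    for k in range(1, terms):
        # Next coefficient: C(mu, k) = C(mu, k-1) * (mu - k + 1) / k
        term *= (mu - (k - 1)) / k * x
        total += term
    return total

def functional_equation_gap(mu: float, nu: float, x: float) -> float:
    """|Phi(mu) * Phi(nu) - Phi(mu + nu)| for the truncated series."""
    product = binomial_series(mu, x) * binomial_series(nu, x)
    return abs(product - binomial_series(mu + nu, x))

def closed_form_gap(mu: float, x: float) -> float:
    """|Phi(mu) - (1 + x)^mu| for the truncated series."""
    return abs(binomial_series(mu, x) - (1.0 + x) ** mu)

if __name__ == "__main__":
    print(closed_form_gap(math.sqrt(2), 0.5))
    print(functional_equation_gap(0.3, 2.5, -0.6))
```

Both gaps should be at the level of floating-point rounding for the sample values shown; this illustrates the theorem but of course proves nothing.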

As was pointed out by Bottazzini (1992), this proof can be viewed as a key that 
enables us to understand how Cauchy constructed his Cours. In his proof Cauchy 
may be said to be using everything treated earlier: continuity, convergence, 
Theorem I that links the two, and the treatment of functional equations. Cauchy’s 
handling of the proof illustrates the technical superiority of his conceptual ap- 
proach to the algorithmic methods of his predecessors. The same is true of the 
theorem on the intermediate value property. Cauchy proves it, in an appendix to 
his Cours (Note III), for continuous functions, whereas his predecessors, including 
Euler, failed to prove it even for the special case of real polynomials. The 
inference is drawn not from the algebraic property of being a polynomial but from 
the conceptual property of continuity. Bolzano noticed this already in 1817. Unlike 
Bolzano, Cauchy does not mention that he has managed to solve two ‘much-courted’ 
problems. 

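As a corollary, putting $\mu = 1/\alpha$ and replacing x by $\alpha x$, with $|x| < 1/\alpha$, 
Cauchy obtains the series 

$$(1+\alpha x)^{1/\alpha} = 1 + \frac{x}{1} + \frac{x^2}{1\cdot 2}(1-\alpha) + \frac{x^3}{1\cdot 2\cdot 3}(1-\alpha)(1-2\alpha) + \cdots,$$

which yields the exponential series for $\alpha$ infinitely small. We recognize Euler’s 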
setup, but this time the proof is correct. The same holds for the second corollary, 
which deals with the series for the logarithm: 


$$(1+x)^\mu = e^{\mu\log(1+x)} = 1 + \frac{\mu\log(1+x)}{1} + \cdots$$

yields 

$$\lim_{\mu\to 0}\frac{(1+x)^\mu - 1}{\mu} = \log(1+x).$$

On the other hand, by the binomial theorem, 

$$\frac{(1+x)^\mu - 1}{\mu} = x - \frac{x^2}{2}(1-\mu) + \frac{x^3}{3}(1-\mu)\left(1-\frac{\mu}{2}\right) - \cdots.$$

For $\mu \to 0$ this yields the series for the logarithm. Those acquainted with Euler’s 
Introductio will notice to what extent Cauchy reminds them of his predecessor. 
After all, the Cours is subtitled Analyse algébrique, and treats the circle of topics of 
the Introductio ‘algebraically,’ without applying the ‘transcendental’ methods of the 
differential and integral calculus. 


7. CAUCHY’S CONCEPT OF NUMBER. Euler’s operating with infinitely small 
and infinitely large numbers was not very problematic when it came to expressions 
and their algorithmic transformation in accordance with the ordinary (rational) 
rules of computation. 
This is different in Cauchy’s case, because Cauchy thought in concepts. If a 
function f(x) was determined by its values $f(x_0)$ for all real values of the 
argument $x_0$, how was it to be determined for values $x_0 + \alpha$ with $\alpha$ infinitely 
small? The vague talk about variables with limit 0 had to be made precise if, as 
Cauchy did ever more frequently, one wanted to talk about infinitely small 
numbers. One gets the impression that Cauchy felt this need when he wrote his 
second textbook, the Résumé of 1823. He introduced his system of infinitely small 
numbers. We recall the formation of the real numbers and then talk about 
Cauchy’s analogous, but more general, number concept. 

Given the rational numbers, one of the possible ways of creating the reals is 
even connected with Cauchy’s name. We form sequences of rational numbers and 
single out the Cauchy sequences $(a_n)$: $(a_n)$ is a Cauchy sequence if for every $\varepsilon > 0$ 
there is an $N(\varepsilon)$ such that $|a_n - a_m| < \varepsilon$ for all $n, m > N(\varepsilon)$. Every Cauchy 
sequence represents a real number, and two such sequences represent the same 
real number if and only if their differences form a null sequence. (One began to 
think in terms of equivalence relations around 1900; I am using here locutions 
found in the works of Heine and Cantor (1872).) 

Cauchy can assume the real functions as given. A real function f(u), defined for 
u > 0, represents an infinitely small number if lim f(u) = 0 for lim u = 0. Cauchy 
says that the function g(u) = u represents the basis $\omega$ of his system and writes 
$f(\omega)$ for the number represented by f(u). It is obvious that one computes with 
these numbers the way one computes with the functions that represent them. 
Cauchy mentioned his conception many times but did not advance it much. In 
particular, he provided no corresponding explanation for infinitely large numbers 
which, as we saw in Section 5, he used very often. 

A consistent development of Cauchy’s setup brings us to a general definition of 
Cauchy numbers (Laugwitz 1991): 

Every real function f(u) defined on an interval 0 < u < p represents a Cauchy 
number. Two such functions f(u) and g(u) represent the same Cauchy number if 
and only if there is an interval 0 < u < q in which f(u) = g(u). We write $\omega$ for the 
Cauchy number represented by the identity function. 

Operations and relations are defined for Cauchy numbers in a fairly obvious 
way by means of operations and relations for their representatives. 

$f(\omega) = g(\omega)$ if and only if f(u) = g(u) in an interval 0 < u < p. Similarly, 
$f(\omega) > g(\omega)$ if and only if f(u) > g(u) in an interval 0 < u < p. 

$\omega^{-1} > r$ for every real r; $\omega^{-1}$ is an infinitely large number. The real numbers 
are the Cauchy numbers represented by the constant functions h(u) = r. 

Note that this introduction of the Cauchy numbers is analogous to the genera- 
tion of the real numbers in terms of rational sequences, which are rational 
functions over the integers. When we work with real numbers we don’t think of 
their genesis in terms of rational sequences. Similarly, once we get used to working 
with Cauchy numbers we don’t have to think of their genesis in terms of real 
functions. 

We can now tell how to evaluate a real function F(x), defined on the interval 
$X_0 < x < X$, at a Cauchy number $\beta = b(\omega)$ in this interval: $F(\beta) = F(b(\omega))$ is 
represented by F(b(u)). This makes sense, because for small u > 0 we have 
$X_0 < b(u) < X$. Without having to mention it every time, $F(\beta)$ is a well-defined 
Cauchy number. 

The continuity property $F(x + \alpha) - F(x) \approx 0$ for $\alpha \approx 0$ is also set up in this 
way, with $\alpha \approx 0$ denoting infinitely small $\alpha$. 
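A small worked example may make these definitions concrete. It is not from Cauchy's text or from the article: the choice $F(x) = x^2$ and the restriction to the single infinitely small number $\alpha = \omega$ are assumptions made purely for illustration.

```latex
% Worked example (illustrative): F(x) = x^2 evaluated at the Cauchy number x + \omega
\begin{align*}
F(x + \omega) - F(x) &= (x + \omega)^2 - x^2 = 2x\omega + \omega^2,
  &&\text{represented by } u \mapsto 2xu + u^2, \\
\lim_{u \to 0}\,(2xu + u^2) &= 0
  &&\Longrightarrow\quad F(x + \omega) - F(x) \approx 0, \\
\omega^{-1} &> r \ \text{ for real } r > 0,
  &&\text{since } 1/u > r \text{ on the interval } 0 < u < 1/r .
\end{align*}
```

The last line shows how an order relation between Cauchy numbers reduces to an ordinary inequality between representatives on a sufficiently short interval.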
Cauchy speaks of systems in the context of group theory as well. A group (a word 
he does not use) is a system whose elements are represented by, say, 
substitutions (i.e., permutations of a finite set). In modern terms, his systems of 
numbers are vector spaces, and even rings. This doesn’t do much for us, but it 
explains why his formulation was too abstract for his contemporaries. This was also 
true of his and Galois’ group theory. 

Today group theory is described in the language of modern algebra. We can try 
to do this for the Cauchy numbers as well. 

Consider the ring R of real functions defined in a right neighborhood of 0, and 
in this ring, the ideal I of functions that vanish in a right neighborhood of 0. Then 
R/I is the ring of Cauchy numbers. If we use the axiom of choice and extend I to 
a maximal ideal J, then R/J is a nonarchimedean ordered field that corresponds 
to the *R of Robinson’s nonstandard analysis. Obviously, there are important 
properties of R/I and R/J other than their algebraic and order properties. One 
such is the possibility of continuing real functions from R to these extension rings. 

My paper of 1991 contains a number of applications of these concepts. Here I'll 
limit myself to showing the logical equivalence of the two definitions of continuity. 

Let F(x) be a real function defined in a neighborhood of $x_0$. The two 
definitions are: 


(I) F is $\varepsilon$-$\delta$ continuous at $x_0$ if for every $\varepsilon > 0$ there is a $\delta > 0$ such that 
$|F(x) - F(x_0)| < \varepsilon$ for all real x with $|x - x_0| < \delta$. 
(II) F is C-continuous at $x_0$ if $F(x_0 + \alpha) - F(x_0) \approx 0$ for all $\alpha \approx 0$. 


(I) implies (II): Let $\alpha$ be represented by a(u). Since $\alpha \approx 0$, it follows that 
$|\alpha| < \delta$, and thus also $|a(u)| < \delta$ for small u > 0. But then $|F(x_0 + a(u)) - F(x_0)| < \varepsilon$ 
for small u > 0, and so $|F(x_0 + \alpha) - F(x_0)| < \varepsilon$ for every positive real $\varepsilon$. 
(II) follows. 

(II) implies (I): Proof by contradiction. Suppose that (I) is false. Then there is an 
exceptional $\varepsilon$, say p > 0, for which there is no corresponding $\delta$. In particular, for 
every u > 0 there must be an a(u), $0 < |a(u)| < u$, with $|F(x_0 + a(u)) - F(x_0)| \ge p$. 

For $\alpha = a(\omega)$ we have a contradiction to the assumption (II). 

The Euler-Cauchy convergence condition $s_n - s \approx 0$ for all infinitely large n is 
now a provable theorem. If it holds, then the $\varepsilon$-condition follows. Otherwise there 
is an exceptional $\varepsilon$, say p > 0, such that for every u > 0 there is an N(u) > 1/u 
with $|s_{N(u)} - s| \ge p$, and thus not infinitely small. The converse is obvious. 


8. DIFFERENTIAL AND INTEGRAL. The account of the foundations of the 
differential and integral calculus, as we know them, goes back to Cauchy who, of 
course, presented them in terms of infinitesimal mathematics and not in terms of 
epsilontics. Given its historical importance, we reproduce it in brief form. 

If a function f'(x) satisfies the condition 

$$\frac{f(x+\alpha) - f(x)}{\alpha} \approx f'(x)$$

for all infinitely small $\alpha$, then it is called the derivative of f(x). All rules follow 
easily, especially if we write 

$$f(x+\alpha) = f(x) + f'(x)\alpha + o(x,\alpha)\alpha \quad\text{with } o(x,\alpha) \approx 0.$$

Cauchy views the differential dy of y = f(x) as a function of two variables, 
dy = f'(x) dx, which is linear with respect to the real variable dx. 

In his Résumé of 1823 Cauchy treated the differential and integral calculus. It is 
noteworthy that he proved the integrability of continuous functions. It is possible 
that he was motivated by the desire to bring continuity into prominence. In the 
18th century, integrals were viewed as antiderivatives, and this no longer satisfied 
the new needs of mathematical physics that required the integration of functions 
for which no antiderivative was known. 

Let f(x) be continuous on the interval $a \le x \le b$, and let 
$a = x_0 < x_1 < \cdots < x_n < x_{n+1} < \cdots < x_N = b$ be a subdivision with $\alpha_n = x_n - x_{n-1} \approx 0$. All the 
numbers in the n-th subinterval are infinitely close to the same real number and, 
because of the continuity of f(x), the corresponding functional values differ by 
infinitely little from the functional value at that point, so that there are numbers 
$M_n$ and $m_n$ with $M_n - m_n \approx 0$ and $M_n \ge f(x) \ge m_n$ in the n-th subinterval. Using 
the concepts introduced by Darboux only around 1870, we can define the upper 
integral of f as the infimum of all upper sums and the lower integral as the 
supremum of all lower sums. (Cauchy used intermediate sums.) It is clear that no 
lower sum is greater than an upper sum. If we can show that the upper sum 
$\sum M_n\alpha_n$ and the lower sum $\sum m_n\alpha_n$ associated with our infinitesimal subdivision 
differ by infinitely little, then both are infinitely close to the same real number that 
is the definite integral $\int_a^b f(x)\,dx$. 

For every real p > 0 we have $0 \le M_n - m_n < p$. Hence 


$$0 \le \sum M_n\alpha_n - \sum m_n\alpha_n = \sum (M_n - m_n)\,\alpha_n < p\sum \alpha_n = p(b-a).$$


Thus the difference between the upper sum and lower sum is indeed infinitely 
small. 

This very intuitive proof, like Cauchy’s own, avoids the explicit use of the 
uniformity of continuity. 
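A finite analogue of this argument can be watched numerically: as the subdivision is refined, the difference between upper and lower sums shrinks. The sketch below is an illustration under stated assumptions, not Cauchy's construction: the name `darboux_gap`, the equal subintervals, and the sampling used to approximate $M_n$ and $m_n$ are all choices made here.

```python
import math

def darboux_gap(f, a: float, b: float, n: int, probes: int = 50) -> float:
    """Upper sum minus lower sum of f over n equal subintervals of [a, b].

    M_k and m_k are approximated by sampling each subinterval at
    probes + 1 equally spaced points (an assumption of this sketch;
    exact extrema would need more care).
    """
    width = (b - a) / n
    gap = 0.0
    for k in range(n):
        left = a + k * width
        values = [f(left + width * j / probes) for j in range(probes + 1)]
        gap += (max(values) - min(values)) * width
    return gap

# The gap for sin on [0, 3] with 10, 100 and 1000 subintervals.
gaps = [darboux_gap(math.sin, 0.0, 3.0, n) for n in (10, 100, 1000)]

if __name__ == "__main__":
    print(gaps)
```

Each refinement by a factor of ten should reduce the gap by roughly the same factor, in line with the bound $p(b - a)$ above.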


9. FOURIER SERIES OF CONTINUOUS FUNCTIONS. Having obtained the 
fundamental theorems on differentiation and integration one can forget the way 
they were derived, because all that counts now is these theorems. That is why the 
infinitely small numbers retained the right to exist only in physics and in differen- 
tial geometry, e.g., for the derivation of differential equations from the considera- 
tion of infinitely small elements of space and time. 

* Besides, the derivative function, as defined by Cauchy, is continuous in its own right. Indeed, if we 
make the variable substitution $z = x + \alpha$ and $\beta = -\alpha$, then we have: 

$$f'(z) \approx \frac{f(z+\beta) - f(z)}{\beta} = \frac{f(x+\alpha) - f(x)}{\alpha} \approx f'(x),$$

so that $f'(x + \alpha) \approx f'(x)$ for every infinitely small $\alpha$. 

A school mathematician may be shocked that all derivative functions are continuous, but in research 
in the area of modern analysis one almost always assumes continuous differentiability. This property is 
logically equivalent to uniform differentiability: $f'(x) = \lim\,(f(x+h) - f(x+k))/(h-k)$ for $h, k \to 0$. 

But if one had a tidy theory of infinitely small numbers, as Cauchy did, one 
could also use it in other ways, and this was done in Paris around 1820. As a 
relevant example we consider Fourier series. The argument that follows is found 
not only in the works of Cauchy and Poisson but also in a surviving manuscript of 
Gauss that probably dates back to 1815. After a simple transformation, the Euler 
series in para. 4 yields 

$$\text{(A)}\qquad \frac12 + q\cos x + q^2\cos 2x + q^3\cos 3x + \cdots + q^n\cos nx + \cdots = \frac12\cdot\frac{1-q^2}{1-2q\cos x+q^2} =: \pi\,\delta(x).$$


For $\varepsilon = 1 - q \approx 0$, $0 < \omega \le x \le 2\pi - \omega$, $\omega = \sqrt[3]{\varepsilon} \approx 0$, we get, with Cauchy, 

$$0 < \frac{1-q^2}{1-2q\cos x+q^2} = \frac{\varepsilon(1+q)}{1-2q(1-2\sin^2 x/2)+q^2} = \frac{\varepsilon(1+q)}{\varepsilon^2+4q\sin^2 x/2} < \frac{\varepsilon}{\sin^2 x/2} \le \frac{\varepsilon}{\sin^2 \omega/2} \approx 0.$$

The function $\delta(x)$ defined in (A) is $2\pi$-periodic, positive everywhere, and 
infinitely small outside of intervals of infinitely small length around the values 
$x = 2\pi g$ with integral g, and its integral over an interval of length $2\pi$ is 1. (The 
latter is obtained by termwise integration of the series on the left. That this is 
permissible can be seen by considering the majorizing geometric progression.) 

Today we call such a function a Dirac delta function. The delta functions, 
outlawed in standard real analysis, turn out to be magic wands in analysis and in 
physics. One knew this around 1820, but then the whole thing was forgotten. 
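The three properties of $\delta(x)$ listed above can be seen in a finite approximation. The sketch below makes several assumptions that are not in the article: a real value q = 0.999 stands in for "q infinitely close to 1", integrals are computed with a plain midpoint rule, and the names `poisson_kernel` and `midpoint_integral` are illustrative.

```python
import math

def poisson_kernel(x: float, q: float) -> float:
    """delta(x) from (A): (1/(2*pi)) * (1 - q^2) / (1 - 2*q*cos(x) + q^2), 0 <= q < 1."""
    return (1.0 - q * q) / (2.0 * math.pi * (1.0 - 2.0 * q * math.cos(x) + q * q))

def midpoint_integral(f, a: float, b: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

q = 0.999                                  # real stand-in for q ~ 1
omega = (1.0 - q) ** (1.0 / 3.0)           # omega = cube root of epsilon
total_mass = midpoint_integral(lambda x: poisson_kernel(x, q), -math.pi, math.pi)
# By symmetry, the mass outside (-omega, omega) is twice the mass on [omega, pi].
outer_mass = 2.0 * midpoint_integral(lambda x: poisson_kernel(x, q), omega, math.pi)
largest_outer_value = poisson_kernel(omega, q)

if __name__ == "__main__":
    print(total_mass, outer_mass, largest_outer_value, poisson_kernel(0.0, q))
```

With these values the total mass is 1 up to quadrature error, while only a small fraction of it lies outside the short interval around 0; the value at 0 is several orders of magnitude larger than any value outside that interval.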

If f(t) is continuous and periodic with period $2\pi$, then 

$$\text{(B)}\qquad f(t) \approx \int \delta(x)\, f(x+t)\,dx = \int \delta(x-t)\, f(x)\,dx,$$

where the integrations extend over any interval of length $2\pi$. Indeed, if g(x) is 
continuous and periodic with period $2\pi$, then 

$$\int_{-\pi}^{+\pi} \delta(x)\, g(x)\,dx = \int_{-\pi}^{-\omega} + \int_{-\omega}^{+\omega} + \int_{+\omega}^{+\pi}.$$

Here the first and last integrals on the right are infinitely small because $\delta$ is. Since 
$\int_{-\omega}^{+\omega} \delta(x)\,dx \approx 1$ and g is continuous, the mean value theorem assigns to the 
middle integral a value that is infinitely close to g(0). 

Now we apply this to 

$$\delta(x-t) = \frac{1}{2\pi} + \frac{1}{\pi}\sum_{n\ge 1} q^n\cos n(x-t) = \frac{1}{2\pi} + \frac{1}{\pi}\sum_{n\ge 1} q^n\,[\cos nx\cos nt + \sin nx\sin nt].$$


For the same reason as before, we can integrate termwise. Using the Fourier 
coefficients 

$$a_n = \frac{1}{\pi}\int_{-\pi}^{+\pi} f(x)\cos nx\,dx, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{+\pi} f(x)\sin nx\,dx,$$

we get from (B) 

$$\text{(C)}\qquad f(t) \approx \frac{a_0}{2} + \sum_{n\ge 1} q^n\,(a_n\cos nt + b_n\sin nt)$$

for continuous $2\pi$-periodic f(t) and $1 > q \approx 1$. 
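Formula (C) can be tried out on a concrete continuous function. The following is a sketch only, under assumptions that are not in the article: the test function is the $2\pi$-periodic triangle wave $|t|$, the Fourier coefficients are computed by a midpoint rule, the series is cut off after 400 terms, and real values of q close to 1 replace the infinitely close q.

```python
import math

def triangle(x: float) -> float:
    """Continuous 2*pi-periodic test function equal to |x| on [-pi, pi]."""
    return abs((x + math.pi) % (2.0 * math.pi) - math.pi)

def fourier_coefficients(f, n_max: int, steps: int = 2048):
    """Midpoint-rule approximations of a_0..a_n_max and b_0..b_n_max."""
    h = 2.0 * math.pi / steps
    xs = [-math.pi + (k + 0.5) * h for k in range(steps)]
    fx = [f(x) for x in xs]
    a = [h / math.pi * sum(v * math.cos(n * x) for v, x in zip(fx, xs))
         for n in range(n_max + 1)]
    b = [h / math.pi * sum(v * math.sin(n * x) for v, x in zip(fx, xs))
         for n in range(n_max + 1)]
    return a, b

def abel_poisson_sum(a, b, q: float, t: float) -> float:
    """Right-hand side of (C), truncated to the available coefficients."""
    total = a[0] / 2.0
    for n in range(1, len(a)):
        total += q ** n * (a[n] * math.cos(n * t) + b[n] * math.sin(n * t))
    return total

A, B = fourier_coefficients(triangle, 400)
SAMPLE_POINTS = [0.0, 0.7, 1.5, 2.9, math.pi]

def max_error(q: float) -> float:
    """Largest deviation of the truncated sum (C) from f at the sample points."""
    return max(abs(abel_poisson_sum(A, B, q, t) - triangle(t)) for t in SAMPLE_POINTS)

if __name__ == "__main__":
    print(max_error(0.99), max_error(0.999))
```

The deviation shrinks as q moves toward 1, which is the finite shadow of the statement that the two sides of (C) are infinitely close when q is.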
Unfortunately, we cannot subsequently put q = 1: this would yield a Fourier 
series. Since 1873 we have known that continuity of a function does not suffice for 
it to be equal to its Fourier series. This result of du Bois-Reymond came as a 
surprise. In the unpublished manuscript mentioned earlier, Gauss drew the false 
conclusion that it does. Cauchy was more careful. He said that the Fourier series 

$$\text{(D)}\qquad \frac{a_0}{2} + \sum_{n\ge 1} (a_n\cos nt + b_n\sin nt)$$

is equal to the function if both series (C) and (D) converge. The reader can verify 
this. 
More Mathematical Double Entendres: 


Open Set: Something that Pete Sampras usually wins at Forest Hills 
Arcsine: What Noah used to direct the animals to his boat 
Inaccessible Cardinal: A VIP at the Vatican 

Cyclic Group: A bunch of people in leather jackets riding Harleys 
Compact Manifold: A part of the engine of a Honda Civic 

Splitting Field: Something to look out for near the San Andreas fault 
Mean Deviation: An activity involving especially nasty behavior 


Contributed by David Sprows, Villanova University 


Q: What is |Rogers — x| < e? 
A: Mr. Rogers’ neighborhood. 


Contributed by Eric Key, University of Wisconsin—Milwaukee 
PROBLEMS AND SOLUTIONS 


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 
with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, Richard Pfiefer, Leonard Smiley, John Henry Steelman, 
Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY 
problems address on the inside front cover. Submitted problems should include 
solutions and relevant references. Submitted solutions should arrive at that address 
before February 28, 1998. Additional information, such as generalizations and 
references, is welcome. The problem number and the solver’s name and address 
should appear on each solution. An acknowledgement will be sent only if a mailing 
label is provided. An asterisk (*) after the number of a problem or a part of a 
problem indicates that no solution is currently available. 


PROBLEMS 


10606. Proposed by Thomas Zaslavsky, Binghamton University, Binghamton, NY. Given a 
positive integer m, show that there is a positive integer n such that, for any group G of order 
at least n, it is possible to choose m elements $g_1, g_2, \ldots, g_m$ of G so that no product of the 
form $g_{i_1} g_{i_2} \cdots g_{i_k}$ with $1 \le k \le m$ and distinct subscripts $i_1, i_2, \ldots, i_k$ in {1, 2, ..., m} 
equals the identity. 


10607. Proposed by Juan-Bosco Romero Marquez, Universidad de Valladolid, Valladolid, 
Spain. Evaluate 


$$\lim_{n\to\infty}\left(\frac{2^x + 4^x + \cdots + (2n)^x}{1^x + 3^x + \cdots + (2n-1)^x}\right)^{n}$$

for x > 0. 


10608. Proposed by Victor Zalgaller, Steklov Mathematical Institute, St. Petersburg, Russia. 
Let S be a compact convex set in the plane. If $l$ is any line of support for S, let $f(l)$ be the 
length of the shortest curve that begins and ends on $l$ and that together with $l$ surrounds S. 
Prove that if $f(l)$ is independent of $l$, then S is a circle. 


10609. Proposed by Donald E. Knuth, Stanford University, Stanford, CA. Let 
_ n _ pyn—kep yk 
a(l,m,n)=) >) (7) (+m—k)"*(k—D*. 


Prove that 


ye a(l,m,n) = manta, m,n) — men 
10610. Proposed by Richard Hall, University of Portsmouth, Portsmouth, England. Given 
a positive integer m, let C(m) be the greatest positive integer k such that, for some set S of 
m integers, every integer from 1 to k belongs to S or is a sum of two not necessarily distinct 
elements of S. For example, C(3) = 8 with S = {1, 3, 4}. 
(a) Show that, for all $\varepsilon > 0$, $1/4 < C(m)/m^2 < 1/2 + \varepsilon$ for all sufficiently large m. 
(b)* Improve the asymptotic bounds in part (a). 
10611. Proposed by Zoltan Sasvari, Technical University of Dresden, Dresden, Germany. 
Find the largest value of a and the smallest value of b for which the inequalities 

$$\frac{1+\sqrt{1-e^{-ax^2}}}{2} < \Phi(x) < \frac{1+\sqrt{1-e^{-bx^2}}}{2}$$

hold for all x > 0, where 

$$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-y^2/2}\,dy.$$

10612. Proposed by John P. Robertson, Anistics/Aon, New York, NY. Fermat proved that 
there are no nontrivial 4-term arithmetic progressions all of whose terms are integer squares. 
(a) Find all 5-term arithmetic progressions such that all terms but the fourth are squares. 
(b) Call two arithmetic progressions essentially different if the ratios of corresponding terms 
differ. For each integer m > 6 show that there are infinitely many essentially different m- 
term arithmetic progressions such that the first 3 terms and the mth term are squares. 


SOLUTIONS 


A Fairly General Family of Integrals 
10393 [1994, 573]. Proposed by Jean Anglesio, Garches, France. Show that 


$$\int_0^\infty \frac{e^{-ax}\,(1-e^{-x})^n}{x^r}\,dx = \frac{(-1)^r}{(r-1)!}\sum_{k=0}^{n}\binom{n}{k}(-1)^k (a+k)^{r-1}\log(a+k),$$

where $a \ge 0$ and $1 \le r \le n$ (except for a = 0, r = 1). 


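Solution by Jet Wimp, Drexel University, Philadelphia, PA. The Gamma function $\Gamma(\sigma)$ is 
analytic for $\sigma \ne 0, -1, -2, \ldots$; it satisfies $\Gamma(\sigma + 1) = \sigma\Gamma(\sigma)$, so $\Gamma(r) = (r - 1)!$ when 
r is a positive integer. Also, we have 

$$\lim_{\sigma\to 1-r}\frac{d}{d\sigma}\,\frac{1}{\Gamma(\sigma)} = \lim_{\sigma\to 1-r}\frac{d}{d\sigma}\,\frac{(\sigma+r-1)\,\sigma(\sigma+1)\cdots(\sigma+r-2)}{\Gamma(\sigma+r)}$$

$$= \lim_{\sigma\to 1-r}\left[(\sigma+r-1)\,\frac{d}{d\sigma}\left(\frac{\sigma(\sigma+1)\cdots(\sigma+r-2)}{\Gamma(\sigma+r)}\right) + \frac{\sigma(\sigma+1)\cdots(\sigma+r-2)}{\Gamma(\sigma+r)}\right] = (-1)^{r-1}(r-1)! \qquad \text{(A)}$$

for r = 1, 2, .... 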


Expanding $(1-e^{-x})^n$ by the binomial theorem, integrating term by term, and using 
$\int_0^\infty e^{-px}x^{\sigma-1}\,dx = \Gamma(\sigma)\,p^{-\sigma}$ for $p, \sigma > 0$ yields 

$$\int_0^\infty e^{-ax}\,(1-e^{-x})^n\, x^{\sigma-1}\,dx = \frac{\sum_{k=0}^{n}\binom{n}{k}(-1)^k (a+k)^{-\sigma}}{1/\Gamma(\sigma)} \qquad \text{(B)}$$

for a > 0. The range of validity of the integral in (B) may be extended from $\sigma > 0$ to 
$\sigma > -n$. In fact, this integral is analytic for complex $\sigma$ with real part greater than $-n$. 

For $\sigma = 1 - r$, r = 1, 2, ..., n, the right side of (B) is indeterminate: the sum in the 
numerator is zero (it is an nth difference of a + k to a nonnegative integral power less than 
n) and the denominator must be zero in the limit since the integral is clearly not. We now 
take the limit of (B) as $\sigma \to 1 - r$, r = 1, 2, ..., n using L’Hospital’s rule. Differentiating 
the numerator using $\frac{d}{d\sigma}(a+k)^{-\sigma} = -(a+k)^{-\sigma}\ln(a+k)$ and the denominator using (A) 
gives the result. 
For other values of $\sigma$ for which the integral exists, its value is given by the formula 
(B). Extension of the result to a = 0 for values of r for which the integral converges, i.e., 
r = 2, ..., n, is trivial. 
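The closed form can be compared against direct numerical integration for a few sample parameters. This is a sketch under stated assumptions, not part of the published solution: a plain midpoint rule on a truncated range replaces the improper integral, only a > 0 is treated, and the function names are illustrative.

```python
import math

def integral_lhs(a: float, n: int, r: int,
                 upper: float = 80.0, steps: int = 400_000) -> float:
    """Midpoint-rule estimate of the integral of e^{-ax} (1 - e^{-x})^n / x^r over (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        # -expm1(-x) equals 1 - e^{-x} without cancellation for small x
        total += math.exp(-a * x) * (-math.expm1(-x)) ** n / x ** r
    return total * h

def closed_form_rhs(a: float, n: int, r: int) -> float:
    """(-1)^r / (r-1)! * sum_k C(n,k) (-1)^k (a+k)^(r-1) log(a+k), for a > 0."""
    s = sum(math.comb(n, k) * (-1) ** k * (a + k) ** (r - 1) * math.log(a + k)
            for k in range(n + 1))
    return (-1) ** r / math.factorial(r - 1) * s

if __name__ == "__main__":
    for a, n, r in [(1.0, 2, 1), (0.5, 3, 2), (1.0, 2, 2)]:
        print(a, n, r, integral_lhs(a, n, r), closed_form_rhs(a, n, r))
```

For a = 1, n = 2, r = 1 the closed form reduces to $2\log 2 - \log 3 = \log(4/3)$, a convenient sanity check.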


Editorial comment. Michael Vowe noted that the cases r = 1, 2,3 can be found in L. S. 
Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, prepared by A. Jeffrey, 
Academic Press, 1980 (formulas 3.411/19 p. 377, 3.411/20 p. 377, and 3.443/2 p. 386). He 
also noted that the case of all r < n is contained in a general formula in section 167 of T. J. 
Bromwich, An Introduction to the Theory of Infinite Series, Second Edition, Macmillan, 
1926, p. 473. Murray S. Klamkin reported that the case r = n appears in Pi Mu Epsilon 
Journal, Problem 805 [1993, 545; 1994, 703] and extended the method of J. S. Frame’s 
solution of that problem (integration by parts, with a careful consideration of the behavior 
at the endpoints) to cover all cases of this problem. 

Solved also by D. Borwein (Canada), P. Bracken (Canada), D. Bradley, E. Braune (Austria), D. Callan, R. J. Chapman 
(U. K.), H. Chen, J. L. Collett (U. K.), L. Euler (transmitted by D. Zeilberger), P. Flajolet (France), M. Golomb, S. A. 
Greenspan, J.-P. Grivaux (France), H. van Haeringen (The Netherlands), R. Holzsager, G. L. Isaacs, F-A. Izadi (Iran), 
G. Keselman, P. Khajeh-Khalili, M. S. Klamkin (Canada), M. J. Knight, O. P. Lossers (The Netherlands), K. McInturff, 
K. D. McLenithan, A. Pechtl (Germany), A. Pedersen (Denmark), H. Prodinger (Austria), M. Vowe (Switzerland), 
T. White, A. N. ’t Woord (The Netherlands), K. Zacharias (Germany), P. J. Zwier, NSA Problems Group, Prague 
Problem Solution Group (Czech Republic), WMC Problems Group, and the proposer. 


A Persistent Distribution 


10394 [1994, 575]. Proposed by Ignacy I. Kotlarski, Oklahoma State University, Stillwater, 
OK. Let $N, Z_1, Z_2, \ldots$ be a sequence of independent random variables, where N follows the 
geometric distribution with $\mathrm{Prob}(N = n) = p(1-p)^{n-1}$ for n = 1, 2, ... with 0 < p < 1, 
while the $Z_j$ are identically distributed complex random variables $Z_j = X_j + iY_j$ where 
$(X_j, Y_j)$ have density 

$$f(x, y) = \begin{cases} \dfrac{a}{2\pi}\,|z|^{a-2} & \text{for } |z| < 1, \\[4pt] 0 & \text{for } |z| > 1, \end{cases}$$

where z = x + iy and a > 0. Find the distribution of $W = Z_1 \cdot Z_2 \cdots Z_N$. 


Solution by David Callan, University of Wisconsin, Madison, WI. W has the same 
distribution as the $Z_j$ with a replaced by pa. 

Switching to polar coordinates, $Z_j = X_j + iY_j = R_j e^{i\Theta_j}$ where $R_j$ and $\Theta_j$ are 
independent, $\Theta_j$ is uniform on $(0, 2\pi)$, and $R_j$ has the power density with parameter 
a: $f(r) = a r^{a-1}$ for 0 < r < 1. Thus $W = Re^{i\Theta}$ where $R = R_1 R_2 \cdots R_N$ and 
$\Theta = \Theta_1 + \Theta_2 + \cdots + \Theta_N$ reduced modulo $2\pi$ and $R$, $\Theta$ are independent. 

It is clear that the distribution of $\Theta$ is uniform on $(0, 2\pi)$. Indeed, if U, V are independent 
random variables with V uniform on $(0, 2\pi)$ then W = U + V reduced modulo $2\pi$ is also 
uniform on $(0, 2\pi)$ regardless of the distribution of U. Thus, for any fixed k, 
$\Theta_1 + \Theta_2 + \cdots + \Theta_k$ reduced modulo $2\pi$ is uniform on $(0, 2\pi)$, and so $\Theta$ is also. 

As for the distribution of R, let $U_j = -\log R_j$. Then $U_j$ has the exponential density 
with parameter a: $g(u) = ae^{-au}$ for u > 0, whose moment generating function is 
$\psi(t) = \sum_{n\ge 0} E(U_j^n)\,t^n/n! = a/(a-t)$. Hence the moment generating function of 
$U = U_1 + U_2 + \cdots + U_N$ is 

$$\sum_{n\ge 1} \mathrm{Prob}(N = n)\,\psi(t)^n = p\,\psi(t)\sum_{n\ge 1} (1-p)^{n-1}\psi(t)^{n-1} = \frac{pa}{pa-t}.$$

Thus U has exponential density with parameter pa and so $R = e^{-U}$ has the power density 
with this parameter. 
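The conclusion about the modulus can be checked by simulation. The sketch below is an illustration under assumptions of its own: the parameter values p = 0.3 and a = 2, the fixed seed, the sample size, and the function name are all arbitrary choices, and only $U = -\log|W|$ is simulated (the uniform angle is not).

```python
import math
import random

def sample_minus_log_modulus(p: float, a: float, rng: random.Random) -> float:
    """Draw U = -log|W| for W = Z_1 ... Z_N with N geometric(p) on {1, 2, ...}."""
    n = 1
    while rng.random() >= p:       # each trial succeeds with probability p
        n += 1
    # -log R_j is exponential with rate a, so U is a sum of n such draws
    return sum(rng.expovariate(a) for _ in range(n))

rng = random.Random(12345)
p, a = 0.3, 2.0
samples = [sample_minus_log_modulus(p, a, rng) for _ in range(200_000)]

mean_u = sum(samples) / len(samples)
tail_at_one = sum(1 for u in samples if u > 1.0) / len(samples)

if __name__ == "__main__":
    # An exponential law with rate p*a has mean 1/(p*a) and P(U > 1) = exp(-p*a).
    print(mean_u, 1.0 / (p * a))
    print(tail_at_one, math.exp(-p * a))
```

Agreement of the sample mean and one tail probability with the exponential law of rate pa is consistent with the solution, though a simulation is of course no substitute for the generating-function argument.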


Solved also by R. A. Agnew, N. Bouzar, W. J. Buhler (Germany), R. Ehrenborg (Canada), J. A. Grzesik, V. Hernandez 
(Spain), S. J. Herschkorn, R. Holzsager, G. Keselman, J. H. Lindsey II, O. P. Lossers (The Netherlands), D. K. Nester, 
T. Shore & D. B. Tyler, T. White, A. N. ’t Woord (The Netherlands), and the proposer. 
Winding Around Points in a Chessboard 


10401 [1994, 682]. Proposed by Donald E. Knuth, Stanford University, Stanford, CA. A 
closed knight’s tour of an m by n chessboard is a sequence $((x_k, y_k))$ for $0 \le k < mn$ 
such that each pair of integers (x, y) with $0 \le x < m$ and $0 \le y < n$ occurs exactly once 
in the sequence, and $(x_{k+1} - x_k)^2 + (y_{k+1} - y_k)^2 = 5$ for all k (including k = mn - 1 
with $(x_{mn}, y_{mn}) = (x_0, y_0)$). Such a tour defines a closed contour C if we connect adjacent 
points $(x_k, y_k)$ and $(x_{k+1}, y_{k+1})$ with straight line segments. 

Let $w_{ij}$ be the winding number of C about the point $(i - \tfrac12, j - \tfrac12)$. Prove that 

$$\sum_{k=0}^{mn-1} (x_k y_{k+1} - x_{k+1} y_k) = 2\sum_{i=1}^{m-1}\sum_{j=1}^{n-1} w_{ij}.$$


Solution by Richard Stong, Rice University, Houston, TX. Let 
$A(C) = \frac12\sum_{k=0}^{N-1} (x_k y_{k+1} - x_{k+1} y_k)$. Note that A(C) can be interpreted as the signed area inside the contour C (a 
point P being considered as inside C with multiplicity the winding number of C around 
P). We show the following more general fact. 


Proposition. Let C be a polygonal closed contour with vertices $(x_0, y_0), (x_1, y_1), \ldots, (x_{N-1}, y_{N-1}), (x_N, y_N) = (x_0, y_0)$ 
such that all the vertices have integer entries and no 
edge of C goes through any point of the form (i - 1/2, j - 1/2) for integer i and j. Let 
$w_{ij}(C)$ denote the winding number of C about the point (i - 1/2, j - 1/2). Then 

$$A(C) = \sum_i \sum_j w_{ij}(C). \qquad (*)$$


Proof. First note that both sides of the equation (*) are additive under the following basic 
operation: take the disjoint union of two closed contours $C_1$ and $C_2$ and remove any edges 
that cancel. Next note that (*) holds if C is the standard contour around a unit square (with 
vertices (a, b), (a + 1, b), (a + 1, b + 1), (a, b + 1), (a, b)), since the area is 1 and the 
winding number $w_{a+1,b+1}(C) = 1$ and all other $w_{ij}(C) = 0$. By symmetry, (*) also holds 
for the reverse of this contour. Thus, by repeated use of the basic operation, we see that 
(*) holds if all edges of C are parallel to the coordinate axes. Also note that (*) holds for 
the triangular contour C with vertices (0, 0), (a, 0), (0, b), (0, 0) with a > 0, b > 0 and 
exactly one of a and b even (so that no point of the form (i - 1/2, j - 1/2) lies on the edge 
from (a, 0) to (0, b)). For this contour, the area is ab/2, and it has winding number 1 about 
the ab/2 points of the form (i - 1/2, j - 1/2) in its interior. By symmetry (*) also holds 
for the contours obtained from C by the rigid motions of the plane that preserve the integer 
lattice. Repeated use of the basic operation shows that (*) holds for all contours. □ 
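The proposition can be tested on a small contour by computing both sides directly. The sketch below is illustrative and rests on assumptions not in the solution: the winding number is computed with a standard crossing-count routine (not the additive argument of the proof), coordinates are doubled so that the half-integer points become odd integers, and the sample contour is a tilted square built from knight moves rather than a full knight's tour.

```python
def shoelace_sum(vertices):
    """Sum of x_k*y_{k+1} - x_{k+1}*y_k around the closed polygon (twice the signed area)."""
    total = 0
    count = len(vertices)
    for k in range(count):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % count]
        total += x0 * y1 - x1 * y0
    return total

def winding_number(vertices, px2, py2):
    """Winding number about (px2/2, py2/2); px2, py2 odd, vertices integer.

    All coordinates are doubled so the arithmetic stays in integers and the
    point never lies level with a vertex.
    """
    wn = 0
    count = len(vertices)
    for k in range(count):
        x0, y0 = (2 * c for c in vertices[k])
        x1, y1 = (2 * c for c in vertices[(k + 1) % count])
        # cross > 0 exactly when the point lies to the left of the directed edge
        cross = (x1 - x0) * (py2 - y0) - (px2 - x0) * (y1 - y0)
        if y0 <= py2 < y1 and cross > 0:
            wn += 1          # upward crossing with the point on the left
        elif y1 <= py2 < y0 and cross < 0:
            wn -= 1          # downward crossing with the point on the right
    return wn

def winding_total(vertices):
    """Sum of winding numbers over all points (i - 1/2, j - 1/2) in the bounding box."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return sum(winding_number(vertices, 2 * i - 1, 2 * j - 1)
               for i in range(min(xs) + 1, max(xs) + 1)
               for j in range(min(ys) + 1, max(ys) + 1))

# Tilted square whose four edges are knight moves (an illustrative contour).
tilted_square = [(0, 0), (2, 1), (1, 3), (-1, 2)]
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

if __name__ == "__main__":
    for contour in (tilted_square, unit_square, tilted_square[::-1]):
        print(shoelace_sum(contour), 2 * winding_total(contour))
```

For each contour the two printed numbers should agree, and reversing the orientation flips the sign of both.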


The proposition holds for all polygonal closed contours with integral vertices if one 
interprets the winding number about points of the form (i — 1/2, j — 1/2) on the boundary 
correctly. The method of proof suggests that this result is a form of Pick’s theorem. Indeed, 
it can be obtained by applying Theorem 1 of B. Grünbaum and G. C. Shephard, Pick’s 
Theorem, this MONTHLY 100 (1993) 150-161 to the integer lattice and the lattice consisting 
of integer points and points of the form (i — 1/2, j — 1/2). 

Solved also by E. Fernandez Moral (Spain), M. Hoffman, R. Holzsager, J. H. Lindsey II, O. P. Lossers (The Netherlands), 
A. Nijenhuis, NSA Problems Group, and the proposer. 


An Improper Double Integral 


10411 [1994, 911]. Proposed by Gord Sinnamon, University of Western Ontario, London, 
Ontario, Canada. Let R be the region inside the unit circle and above the line x + y = 1. 


Calculate 

$$\iint_R \frac{dx\,dy}{\left((\log x)^2 + (\log y)^2\right)xy}. \qquad (*)$$

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Solution I by Thomas M. McDonald, Gannon University, Erie, PA. The value of (*) is 
mw In2/2. We show that if 0 < a < band R is the first quadrant region bounded by the 
curves x? + y* = 1 and x? + y? = 1, then the value of (*) is (7/2) In(b/a). 

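The substitutions $u = \ln x$ and $v = \ln y$ transform (*) to $\iint_S du\,dv/(u^2 + v^2)$, where
S is the third quadrant region between the curves $e^{au} + e^{av} = 1$ and $e^{bu} + e^{bv} = 1$. The
substitutions $u = r\cos\theta$ and $v = r\sin\theta$ give $\iint_T (1/r^2)\,r\,dr\,d\theta$, where T is the region
between the curves $e^{ar\cos\theta} + e^{ar\sin\theta} = 1$ and $e^{br\cos\theta} + e^{br\sin\theta} = 1$. If $f(\theta)$ is the function
defined implicitly by the equation $e^{f(\theta)\cos\theta} + e^{f(\theta)\sin\theta} = 1$, then the required value is

$$\int_{\pi}^{3\pi/2}\int_{f(\theta)/b}^{f(\theta)/a} \frac{1}{r}\,dr\,d\theta
= \int_{\pi}^{3\pi/2}\left(\ln\frac{f(\theta)}{a} - \ln\frac{f(\theta)}{b}\right)d\theta
= \int_{\pi}^{3\pi/2}\ln\frac{b}{a}\,d\theta = \frac{\pi}{2}\ln\frac{b}{a}.$$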


All of these manipulations are justified because each integrand is nonnegative and the final 
expression converges. 


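Solution II by O. P. Lossers, University of Technology, Eindhoven, The Netherlands. Because

$$\frac{\partial}{\partial y}\arctan\frac{\log x}{\log y} = -\frac{\log x}{y(\log^2 x + \log^2 y)},$$

(*) equals

$$\lim_{\delta\downarrow 0}\; -\int_{x=\delta}^{1}\frac{dx}{x\log x}\left[\arctan\frac{\log x}{\log y}\right]_{y=1-x}^{y=\sqrt{1-x^2}}
= \lim_{\delta\downarrow 0}\int_{\delta}^{1}\frac{dx}{x\log x}\left(\arctan\frac{\log x}{\log(1-x)} - \arctan\frac{\log x^2}{\log(1-x^2)}\right).$$

The substitution $x = z^{1/2}$ reduces $\int_{\delta}^{1}\frac{dx}{x\log x}\arctan\frac{\log x^2}{\log(1-x^2)}$ to

$$\int_{z=\delta^2}^{1}\frac{\tfrac12 z^{-1/2}\,dz}{z^{1/2}\,\tfrac12\log z}\arctan\frac{\log z}{\log(1-z)}.$$

Hence, (*) is found to be

$$\lim_{\delta\downarrow 0}\int_{x=\delta}^{\delta^2}\frac{dx}{x\log x}\arctan\frac{\log x}{\log(1-x)} = \tfrac12\pi\log 2,$$

since, for some $\epsilon$ with $\delta^2 < \epsilon < \delta$, we have

$$\int_{\delta}^{\delta^2}\frac{dx}{x\log x}\arctan\frac{\log x}{\log(1-x)} = \arctan\frac{\log\epsilon}{\log(1-\epsilon)}\int_{\delta}^{\delta^2}\frac{dx}{x\log x}.$$

The generalization in Solution I can be obtained by the same method.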


Solution III by Jet Wimp, Drexel University, Philadelphia, PA. Define a continuous function
on [0, 1] by $g(0) = 0$, $g(1) = \pi/2$, and $g(t) = \arctan(\ln(1-t)/\ln t)$ for $0 < t < 1$. We
have

$$I = \iint_R \frac{dx\,dy}{xy\bigl((\log x)^2 + (\log y)^2\bigr)} = \int_0^1\frac{dx}{x}\int_{1-x}^{\sqrt{1-x^2}}\frac{dy}{y\bigl((\log x)^2 + (\log y)^2\bigr)}$$

$$= \int_0^1\frac{1}{x\ln x}\left[\arctan\left(\frac{\ln y}{\ln x}\right)\right]_{1-x}^{\sqrt{1-x^2}}dx = \int_0^1\frac{g(x^2) - g(x)}{x\ln x}\,dx = -\int_0^1\frac{dx}{x\ln x}\int_{x^2}^{x} g'(t)\,dt.$$

Interchanging the order of integration gives

$$I = -\int_0^1 g'(t)\,dt\int_t^{\sqrt{t}}\frac{dx}{x\ln x} = -\int_0^1 g'(t)\Bigl[\ln|\ln x|\Bigr]_t^{\sqrt{t}}\,dt
= \ln 2\int_0^1 g'(t)\,dt = \ln 2\,\bigl(g(1) - g(0)\bigr) = \frac{\pi}{2}\ln 2.$$
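The closed form can be sanity-checked numerically. The sketch below is not part of any of the published solutions: it assumes the one-dimensional reduction $I = \int_0^1 (g(x^2) - g(x))/(x\ln x)\,dx$ from Solution III and applies a plain midpoint rule. The function names and the number of cells are illustrative choices.

```python
import math


def g(t):
    """g(t) = arctan(ln(1 - t) / ln t) for 0 < t < 1, as in Solution III."""
    return math.atan(math.log1p(-t) / math.log(t))


def integral_estimate(cells=200_000):
    """Midpoint-rule estimate of the integral of (g(x^2) - g(x)) / (x ln x) over (0, 1)."""
    h = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * h  # midpoints never touch the endpoints 0 and 1
        total += (g(x * x) - g(x)) / (x * math.log(x))
    return total * h


estimate = integral_estimate()
closed_form = math.pi / 2 * math.log(2)
print(estimate, closed_form)
```

The integrand tends to 0 at both endpoints, which is why an unweighted midpoint rule is adequate for a rough check.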


Solved also by J. Anglesio (France), R. Bagby, R. J. Chapman (U. K.), D. A. Darling, J. S. Frame, M. Golomb, J. A. 
Grzesik, M. Hoffman, R. Holzsager, W. C. Lang, T. L. McCoy, D. K. Nester, A. Nijenhuis, T. Schonbek, R. Stong, 
D. Swearingen, A. A. Tarabay (Lebanon), M. Vowe (Switzerland), C. Y. Yildirim (Turkey), K. Zacharias (Germany), 
Anchorage Math Solutions Group, NSA Problems Group, and the proposer. 


Balanced Sequences 


10430 [1995, 71]. Proposed by Fred Galvin, University of Kansas, Lawrence, KS, and John
Isbell, SUNY, Buffalo, NY. Let $D(a_1, \ldots, a_n)$ denote the sum of the absolute deviations
of the real numbers $a_1, \ldots, a_n$ from their median. Call a sequence balanced if the n - 1
quantities $D(a_1, \ldots, a_k) + D(a_{k+1}, \ldots, a_n)$, k = 1, 2, ..., n - 1, are all equal.

(a) Show that, for each integer n > 1, a nonconstant balanced sequence of n terms exists 
and is unique up to an affine transformation. 

(b) Characterize the positive integers n for which there exists a strictly increasing balanced 
sequence of n terms. 


Solution by Robin J. Chapman, University of Exeter, Exeter, UK. Statement (a) is false as
given but becomes true when we consider only nondecreasing sequences. Let $b_j = a_{j+1} - a_j$
for $1 \le j < n$. For part (a), it suffices to show that these differences are uniquely determined
up to a constant factor and that there is a nontrivial nonnegative solution. For part (b), we
show that all $b_j$'s are nonzero if and only if $n \in \{1, 2, 4\}$ or n is a prime such that -1 and 2
generate the group of nonzero residues modulo n.

Let $d_k = D(a_1, \ldots, a_k) + D(a_{k+1}, \ldots, a_n)$ for $1 \le k < n$. The statement is trivial for
$n \le 2$, so we assume that $n \ge 3$. We first consider odd n; let $r = (n-1)/2$ with $r \ge 1$.
For $1 \le j \le r$, we have

$$d_{2j-1} = -\sum_{i=1}^{j-1} a_i + \sum_{i=j+1}^{2j-1} a_i - \sum_{i=2j}^{j+r} a_i + \sum_{i=j+r+1}^{2r+1} a_i;$$

$$d_{2j} = -\sum_{i=1}^{j} a_i + \sum_{i=j+1}^{2j} a_i - \sum_{i=2j+1}^{j+r} a_i + \sum_{i=j+r+2}^{2r+1} a_i.$$

Equality between successive quantities in $d_1, \ldots, d_{2r}$ yields

$A_j$: $a_j + a_{r+j+1} = 2a_{2j}$ for $1 \le j \le r$;
$B_j$: $a_{j+1} + a_{r+j+1} = 2a_{2j+1}$ for $1 \le j \le r-1$.

Comparing neighboring equations in the sequence $A_1, B_1, A_2, \ldots, A_{r-1}, B_{r-1}, A_r$ yields
$b_j = 2b_{2j}$ and $b_{r+j+1} = 2b_{2j+1}$ for $1 \le j \le r-1$. For $1 \le k \le 2r$ and $k \notin \{r, r+1\}$, this
yields $b_k = 2b_{\langle 2k\rangle}$, where $\langle u\rangle$ denotes the least positive residue of the integer u modulo n.
Together with $A_1$, which is equivalent to $b_1 = \sum_{i=2}^{r+1} b_i$, these relations form a complete set
of equations for $\{b_i\}$. If $\langle 2^s k\rangle \notin \{r, r+1\}$ for all positive s, then $b_k = 2^S b_k$ when S is the
order of 2 modulo n. It follows that $b_k = 0$ unless k is congruent to $\pm 2^t$ modulo n for some
t. When n is not prime, and also when n is prime but 2 and -1 do not generate the group
of invertible residues modulo n, we obtain $b_k = 0$ for some k, and the resulting sequence
$\{a_i\}$ is not strictly increasing. If T is the least positive integer such that $2^T \equiv \pm 1 \pmod n$,
then $b_{\langle 2^s\rangle} = 2^{T-s-1}b$ and $b_{\langle -2^s\rangle} = 2^{T-s-1}b'$, where $\{b, b'\} = \{b_r, b_{r+1}\}$. The relation
$b_1 = \sum_{i=2}^{r+1} b_i$ now becomes $2^{T-1}b = ub + vb'$, where u and v are positive integers with
$u + v = 2^{T-1}$. Hence $b = b'$, and so $b_{\langle \pm 2^s\rangle} = 2^{T-1-s}b$, and $b_k = 0$ if $k \not\equiv \pm 2^s \pmod n$
for all positive s.

We next consider even n; let $r = n/2$ with $r \ge 2$. The expressions for $\{d_k\}$ become

$$d_{2j-1} = -\sum_{i=1}^{j-1} a_i + \sum_{i=j+1}^{2j-1} a_i - \sum_{i=2j}^{j+r-1} a_i + \sum_{i=j+r+1}^{2r} a_i;$$

$$d_{2j} = -\sum_{i=1}^{j} a_i + \sum_{i=j+1}^{2j} a_i - \sum_{i=2j+1}^{j+r} a_i + \sum_{i=j+r+1}^{2r} a_i.$$

Equality between successive quantities in $d_1, \ldots, d_{2r-1}$ yields

$A_j$: $a_j + a_{r+j} = 2a_{2j}$ for $1 \le j \le r-1$;
$B_j$: $a_{j+1} + a_{r+j+1} = 2a_{2j+1}$ for $1 \le j \le r-1$.

Comparing neighboring equations in the sequence $A_1, B_1, A_2, B_2, \ldots, A_{r-1}, B_{r-1}$ yields
$b_j + b_{r+j} = 2b_{2j}$ for $1 \le j \le r-1$ and $b_{2j+1} = 0$ for $1 \le j \le r-2$. Together with $A_1$, which
is equivalent to $b_1 = \sum_{i=2}^{r} b_i$, these give a complete set of equations for $\{b_i\}$.

We consider two cases, depending on the parity of r. Suppose first that r is even. When
r = 2, our equations are $b_1 + b_3 = 2b_2$ and $b_1 = b_2$, which require $b_1 = b_2 = b_3$. Suppose
then that $r \ge 4$. In this case $b_{r+1} = 0$, and thus $b_1 = 2b_2$ and $b_{2r-1} = 2b_{2r-2}$. For other
odd k, we have $b_k = 0$. Expressing $A_1$ as $2b_1 = \sum_{i=1}^{r} b_i$, we thus obtain $b_1 = \sum_{j=1}^{r/2} b_{2j}$.
Letting $b'_j = b_{2j}$ for $1 \le j \le r-1$, we obtain $b'_1 = \sum_{j=2}^{r/2} b'_j$, $b'_j + b'_{r/2+j} = b_{2j} + b_{r+2j} =
2b_{4j} = 2b'_{2j}$ for $1 \le j \le r/2-1$, and $2b'_{2j+1} = 2b_{4j+2} = b_{2j+1} + b_{r+2j+1} = 0$ for
$1 \le j \le r/2-2$. This reduces the case n = 2r with r even to the case where n = r.
Note that when n = 4, all $b_i$ are nonzero, but this fails when n = 8 because we obtain
$b_3 = b_5 = 0$ before the reduction.

Now suppose that r is odd. Our equations become $b_1 + b_{r+1} = 2b_2$, $b_{r-1} + b_{2r-1} =
2b_{2r-2}$, $b_1 = \sum_{i=2}^{r} b_i$, and $b_k = 2b_{\langle 2k\rangle}$ for even k with $0 < k < 2r$ and $k \ne r \pm 1$, where
$\langle u\rangle$ again denotes the least positive residue of the integer u modulo n = 2r. If k is even and
$2^s k \not\equiv r \pm 1 \pmod{2r}$ for all s, then $b_k = 2^s b_k$ for some positive s, which yields $b_k = 0$.
All other even k with $0 < k < 2r$ satisfy $k \equiv \pm 2^s$ for some $s > 0$. There is a least positive
integer T such that $2^T \equiv r \pm 1 \pmod n$, and so for $1 \le s \le T$ we have $b_{\langle 2^s\rangle} = 2^{T-s}b$
and $b_{\langle -2^s\rangle} = 2^{T-s}b'$, where $\{b, b'\} = \{b_{r-1}, b_{r+1}\}$. Now $b_1 + b_{r+1} = 2b_2 = 2^T b$ and
$b_1 = \sum_{i=2}^{r} b_i = ub + vb'$, where u and v are nonnegative integers with $u > 0$, and
$u + v = 2^T - 1$. Hence either $2^T b = ub + (v+1)b'$ or $2^T b = (u+1)b + vb'$. In the
former case, $b = b'$. In the latter case, $b = b_{r+1}$; thus the sum for $b_1$ contains the term
$b_{r-1} = b'$, and we obtain $v > 0$ and again $b = b'$.

In all cases we have shown that a nonnegative solution $\{b_i\}$ exists, unique up to
multiplication by a constant. The solution contains no zero terms if and only if $n \in \{1, 2, 4\}$ or
n is a prime such that -1 and 2 generate the group of nonzero residues modulo n.
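The relations above can be checked on small cases. The following sketch is illustrative only: the helper names are hypothetical, and the difference vectors are the ones the equations give for n = 5, 6, 7, and 8 (for instance $b = (2, 1, 1, 2)$ when n = 5). It computes each $d_k$ directly from the definition of D.

```python
def abs_dev_sum(values):
    """Sum of absolute deviations from the median.

    For sorted data this is (sum of upper half) - (sum of lower half);
    the middle element of an odd-length list contributes zero.
    """
    s = sorted(values)
    half = len(s) // 2
    return sum(s[len(s) - half:]) - sum(s[:half])


def split_sums(a):
    """Return [d_1, ..., d_{n-1}], where d_k = D(a_1..a_k) + D(a_{k+1}..a_n)."""
    return [abs_dev_sum(a[:k]) + abs_dev_sum(a[k:]) for k in range(1, len(a))]


def is_balanced(a):
    """A sequence is balanced when all the d_k coincide."""
    return len(set(split_sums(a))) <= 1


def from_differences(b):
    """Build a_1 = 0, a_{j+1} = a_j + b_j from a list of differences."""
    a = [0]
    for step in b:
        a.append(a[-1] + step)
    return a


examples = {
    5: from_differences([2, 1, 1, 2]),           # strictly increasing
    6: from_differences([1, 1, 0, 1, 1]),        # b_3 = 0
    7: from_differences([4, 2, 1, 1, 2, 4]),     # strictly increasing
    8: from_differences([2, 1, 0, 1, 0, 1, 2]),  # b_3 = b_5 = 0
}
for n, seq in examples.items():
    print(n, seq, split_sums(seq))
```

Each printed list of split sums is constant, and zero differences appear exactly for n = 6 and n = 8, in line with the characterization.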


Editorial comment. The condition $a_1 \le a_2 \le \cdots \le a_n$ appeared in the problem proposal
but was omitted in the published version; all solvers recognized the need for some such
condition. The proposers note that in H. Steinhaus, 100 Problems in Elementary Mathematics,
Basic Books, 1964, reprinted by Dover, 1979, Problem 56 calls for minimizing
$D(a_1, \ldots, a_k) + D(a_{k+1}, \ldots, a_n)$ for a particular sequence of length 120. The solution in
the book says that k = 60 minimizes, but k = 72 is better (by 0.24). Since any calculus
of such problems would include a method of recognizing balanced sequences, it is striking
that a deep problem of number theory would appear in the characterization in part (b).
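The number-theoretic condition in part (b) is easy to tabulate for small n. The sketch below is a direct transcription of the stated criterion (n in {1, 2, 4}, or n a prime for which -1 and 2 generate the nonzero residues); the function names are hypothetical and no claim is made about efficiency.

```python
import math


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def generated_by_2_and_minus_1(p):
    """True if every nonzero residue mod the odd prime p has the form +-2^t."""
    seen = set()
    x = 1
    while x not in seen:
        seen.add(x)       # the residue 2^t
        seen.add(p - x)   # the residue -2^t
        x = (2 * x) % p
    return len(seen) == p - 1


def admits_strictly_increasing_balanced(n):
    """The criterion stated in the solution to part (b)."""
    if n in (1, 2, 4):
        return True
    return n > 2 and is_prime(n) and generated_by_2_and_minus_1(n)


admissible = [n for n in range(1, 32) if admits_strictly_increasing_balanced(n)]
print(admissible)
```

The primes 17 and 31 are excluded because the subgroup generated by 2 and -1 has only 8 and 10 elements, respectively.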


Solved also by D. Beckwith, D. E. Knuth, J. H. Lindsey II, and the proposers.


A Sequence of Distinct Integers


10432 [1995, 169]. Proposed by David M. Bloom, Brooklyn College, CUNY, Brooklyn, NY. 
Let 


$P = \{p \in \mathbb{Z}^+ : p \text{ is prime and } p \equiv 3 \pmod 4\}.$


For p € P, let S(p) denote the sum of all quadratic residues (mod p) that lie in the interval 
(0, p/2), and let R(p) denote the least positive residue of S(p) (mod p). 

(a) Prove that R is one-to-one. 

(b) Show that there are infinitely many positive integers that are not in the range of R. 


Solution by Thomas Honold, Technical University, Munich, Germany. We have S(3) =
R(3) = 1. For p > 3, we compute R(p) in terms of the residue class of p modulo 16.


p (mod 16)    3              7             11            15
R(p)          (11p - 1)/16   (7p - 1)/16   (3p - 1)/16   (15p - 1)/16


Let S′(p) denote the sum of all quadratic residues (mod p) that lie in the interval (p/2, p),
and let N(p) denote the sum of all quadratic nonresidues in the interval (0, p/2). Since
p is congruent to 3 modulo 4 and is not divisible by 3, 24 divides $p^2 - 1$. Also note that

$$S(p) + S'(p) \equiv \sum_{i=1}^{(p-1)/2} i^2 = p(p^2 - 1)/24 \equiv 0 \pmod p.$$

Since -1 is a quadratic nonresidue (mod p), x is a nonsquare (mod p) if and only if p - x is
a square (mod p); thus $S'(p) \equiv -N(p) \pmod p$. Hence $S(p) \equiv N(p) \pmod p$, and we
have $2S(p) \equiv S(p) + N(p) = \sum_{k=1}^{(p-1)/2} k = (p^2 - 1)/8$. We conclude that $16S(p) \equiv -1 \pmod p$.
Thus $16R(p) + 1 = kp$, where $1 \le k < 16$. Since $kp \equiv 1 \pmod{16}$, we have
k = 11, 7, 3, 15 for $p \equiv 3, 7, 11, 15 \pmod{16}$, respectively.

Suppose that R(p) = R(q) for distinct primes $p, q \in P$. Since R(p) > 1 for p > 3, we have
p, q > 3 and $kp = lq \equiv 1 \pmod{16}$ for distinct $k, l \in \{3, 7, 11, 15\}$. Since no prime in P
is a multiple of 5, this is impossible when k or l is 15. Since 11, 7, and 3 are all prime and
p, q > 3, the remaining cases are also impossible.

If t is in the range of R, then the prime divisors of 16t + 1 lie in $P \cup \{5\}$. Since 13 divides
16(13k + 4) + 1 for each $k \ge 0$, none of the positive integers of the form 13k + 4 are in the
range of R.
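The table and both conclusions can be spot-checked by brute force. This is only a sketch: the bound 600 and the helper names are arbitrary illustrative choices, and R is computed straight from its definition rather than from the formula.

```python
import math


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def R(p):
    """Residue mod p of the sum of the quadratic residues lying in (0, p/2)."""
    squares = {(x * x) % p for x in range(1, p)}
    s = sum(q for q in squares if 2 * q < p)
    return s % p  # never 0 for p > 3, because 16*S(p) is congruent to -1 mod p


# Multiplier k in 16*R(p) + 1 = k*p, indexed by p mod 16 (valid for p > 3).
K = {3: 11, 7: 7, 11: 3, 15: 15}

primes = [p for p in range(7, 600) if p % 4 == 3 and is_prime(p)]
values = [R(p) for p in primes]

for p, value in zip(primes[:5], values[:5]):
    print(p, value, (K[p % 16] * p - 1) // 16)
```

For example, p = 19 has residues 1, 4, 5, 6, 7, 9 below p/2, with sum 32, so R(19) = 13 = (11 · 19 - 1)/16.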


Solved also by R. J. Chapman (U. K.), T. R. Hagedorn, R. Holzsager, N. Komanda, J. H. Lindsey II, O. P. Lossers (The 
Netherlands), L. E. Mattics, P. Venzke, A. N. ’t Woord (The Netherlands), NSA Problems Group, and the proposer. 


Commutative Algebra without Commutativity 


10437 [1995, 170]. Proposed by J. Maurice Rojas, University of California, Berkeley,
CA, and AT&T Bell Laboratories, Naperville, IL. Let R be a ring (whose multiplication
is not necessarily commutative or associative) without zero divisors. Let $x_1, \ldots, x_n$
be algebraically independent indeterminates over R that commute and associate amongst
themselves and commute with the elements of R. Also assume the associative law for
products of one element of R and two $x_i$. Prove the following:


(a) If $f \in R[x_1, \ldots, x_n]$ is homogeneous, then any divisor of f is homogeneous.

(b) If $\alpha_1, \ldots, \alpha_n$ are nonzero elements of R and $d_1, \ldots, d_n$ are nonnegative integers with
$\gcd(d_1, \ldots, d_n) = 1$, then the polynomial $\alpha_1 x_1^{d_1} + \cdots + \alpha_n x_n^{d_n}$ is irreducible in
$R[x_1, \ldots, x_n]$, i.e., every factorization has at most one nonconstant factor.


Solution by the proposer (current affiliation: Massachusetts Institute of Technology, Cambridge,
MA). In this solution we assume some familiarity with basic convex geometry. Each
nonnegative point $q = (q_1, \ldots, q_n) \in \mathbb{Z}^n$ determines (up to a constant multiple) a monomial
$c_q x_1^{q_1} \cdots x_n^{q_n}$, which we write as $c_q x^q$. Thus we may write each $f \in R[x_1, \ldots, x_n]$ as
$f(x) = \sum_{q \in \mathbb{Z}^n} c_q x^q$. Define the support supp f of f to be $\{q \in \mathbb{Z}^n : c_q \ne 0\}$. Define the
Newton polytope N(f) of f to be the convex hull of supp f in $\mathbb{R}^n$. The vertices of a Newton
polytope have integral coordinates. Let O be the origin in $\mathbb{R}^n$; observe that $N^{-1}(O) = R$.

We reduce our factorization questions to results on the decomposition of polytopes into 
vector sums by proving the following lemma. 


Lemma. For any product of polynomials $g_1, \ldots, g_k$, however it is associated, the Newton
polytope equals $N(g_1) + \cdots + N(g_k)$.


Proof. This is immediate for k = 1 and the case k = 2 is the inductive step in the proof for
general k. It remains to give a proof for k = 2.

Multiplying monomials in $x_1, \ldots, x_n$ adds exponents, since the variables commute and
associate among themselves. Thus $\operatorname{supp} g_1g_2 \subseteq \operatorname{supp} g_1 + \operatorname{supp} g_2$, and so $N(g_1g_2) \subseteq
N(g_1) + N(g_2)$. We need verify only that the coefficients of $g_1g_2$ corresponding to vertices
of $N(g_1) + N(g_2)$ are nonzero.

Let v be such a vertex, and let w be a vector such that v is the only point of $N(g_1) + N(g_2)$
minimizing the inner product $\langle w, v\rangle$. Since v is the vector sum of the faces of $N(g_1)$ and
$N(g_2)$ with inner normals equal to w, it is clear that $v = v_1 + v_2$ where $v_i$ is a vertex of
$N(g_i)$. Hence the corresponding monomial terms satisfy $c_v x^v = (c_{v_1} x^{v_1})(c_{v_2} x^{v_2})$. Since
R has no zero divisors, the coefficient $c_v$ must be nonzero. ∎
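The Lemma can be illustrated in the simplest setting. The sketch below assumes integer coefficients and two variables, a far more special case than the ring R of the problem; polynomials are stored as dictionaries from exponent pairs to coefficients, and all names are hypothetical. It compares the hull of the support of a product with the hull of the pairwise vertex sums.

```python
def multiply(f, g):
    """Multiply two bivariate polynomials given as {(i, j): coefficient}."""
    h = {}
    for (a1, a2), c in f.items():
        for (b1, b2), d in g.items():
            key = (a1 + b1, a2 + b2)
            h[key] = h.get(key, 0) + c * d
    return {k: v for k, v in h.items() if v != 0}  # drop cancelled terms


def convex_hull(points):
    """Vertices of the convex hull (Andrew's monotone chain, collinear points dropped)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return lower[:-1] + upper[:-1]


def newton_vertices(f):
    """Vertex set of the Newton polytope of f."""
    return set(convex_hull(f.keys()))


def minkowski_vertices(f, g):
    """Vertex set of N(f) + N(g), built from sums of vertices."""
    sums = [(a[0] + b[0], a[1] + b[1])
            for a in newton_vertices(f) for b in newton_vertices(g)]
    return set(convex_hull(sums))


g1 = {(1, 0): 1, (0, 1): -1}            # x - y
g2 = {(1, 0): 1, (0, 1): 1}             # x + y
g3 = {(0, 0): 1, (1, 0): 1, (0, 1): 1}  # 1 + x + y
g4 = {(2, 0): 1, (0, 1): -1}            # x^2 - y

print(newton_vertices(multiply(g1, g2)), minkowski_vertices(g1, g2))
```

In the first example the middle term xy cancels, yet the vertices (2, 0) and (0, 2) survive, which is the point of the zero-divisor argument.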


To prove (a), note that f is homogeneous if and only if N(f) is contained in a hyperplane
normal to $(1, \ldots, 1) \in \mathbb{R}^n$. Hence every summand must also be contained in a hyperplane
normal to (1, ..., 1); otherwise the Lemma implies that N(f) would have a face not parallel
to this hyperplane.

Let $\mathcal{P}$ be the set of all integral polytopes contained in the closure of the positive orthant
of $\mathbb{R}^n$. To prove (b), it suffices by the Lemma to show that $N(\alpha_1 x_1^{d_1} + \cdots + \alpha_n x_n^{d_n})$ is
indecomposable with respect to vector sum in $\mathcal{P}$, i.e., if $N(\alpha_1 x_1^{d_1} + \cdots + \alpha_n x_n^{d_n}) = P_1 + P_2$
for some $P_1, P_2 \in \mathcal{P}$, then either $P_1$ or $P_2$ consists only of the origin O. Let s be the
number of nonzero exponents $d_i$, and let $t = \min\{s, n-1\}$. Since $N(\alpha_1 x_1^{d_1} + \cdots + \alpha_n x_n^{d_n})$
is the t-simplex whose vertex set is the scaled standard basis $\{d_1\hat{e}_1, \ldots, d_n\hat{e}_n\}$ and
$\gcd(d_1, \ldots, d_n) = 1$, it suffices to prove the following theorem.


Theorem. Suppose a polytope $P \in \mathcal{P}$ satisfies the following three conditions: (i) All
2-dimensional faces of P are triangles; (ii) $P \cap \{x : x_i = 0\} \ne \emptyset$ for all $i \in \{1, \ldots, n\}$; and
(iii) $\gamma P$ is not integral for any $\gamma \in (0, 1)$. Then P is indecomposable in $\mathcal{P}$.


Proof. By Theorem 3 of B. Grünbaum, Convex Polytopes, Interscience, 1969, p. 321, (i)
implies that every summand of P in a vector sum is a nonzero point or is $\lambda P$ for some
$\lambda \in [0, 1]$. By (ii), this implies that a decomposition of P in $\mathcal{P}$ must have the form
$\lambda_1 P + \cdots + \lambda_k P$, where $\lambda_i \in [0, 1]$ and $\lambda_1 + \cdots + \lambda_k = 1$. By (iii), all but one of the $\lambda_i$
are 0. ∎


Editorial comment. K. S. Kedlaya proved (a) by first showing by induction on n that
$R[x_1, \ldots, x_n]$ has no zero divisors. This follows, as in section 18 of O. Zariski and P. Samuel,
Commutative Algebra, Vol. 1, Van Nostrand, 1958, by writing polynomials as $\sum a_i x_n^i$ with
$a_i \in R[x_1, \ldots, x_{n-1}]$ and considering the terms of highest degree. Then, if $f = g \cdot h$,
arrange the terms of the factors by total degree. The product of the parts of smallest degree
gives the part of f of smallest degree, and similarly for the parts of largest degree.


REVIEWS


Edited by Underwood Dudley 
Mathematics Department, De Pauw University, Greencastle, IN 46135 


The Sheer Joy of Celestial Mechanics. By Nathaniel Grossman. Birkhäuser, Boston,
1996. 181 pp.


Reviewed by J. M. Anthony Danby 


Many of us have felt, while teaching calculus, that the “applications” that are 
embedded into the traditional texts seem sterile, having little relation to the 
subject matter as it is applied outside the classroom. Professor Grossman has 
addressed this by assembling some topics from the field of celestial mechanics to 
show how the techniques of calculus can be “put to work.” He also shows how 
some topics that contemporary students may never encounter (Lagrange expan- 
sions, or Bessel functions, for instance) can be used in natural ways to solve 
problems. The text could be used in an elective course for juniors and seniors. It is 
clearly a labor of love: hence the title. 

The text opens with an introduction to Newtonian mechanics using vectors. 
There is some emphasis on rotating reference systems, illustrated by a discussion 
of the Foucault pendulum. Some of this material could be profitably included in 
the calculus sequence. 

The second chapter is concerned with “Central Forces,” that is, the motion of a 
particle subject to a force directed toward a fixed point. This includes discussions 
of theorems by Bonnet, on the superposition of different force fields, and by 
Hamilton, on the law of force necessary for a closed elliptic orbit. Bertrand’s 
theorem, proved in a later chapter, is also introduced: this states that the only
central laws of force, $kr^p$, proportional to a power of the distance from the center,
for which all bounded orbits are closed, are those with p = 1 or -2. This is
important in arguing for the uniqueness of the inverse square law of gravitational 
attraction. It should be noted that although the theorems are stated and proved, 
the format is conversational and readable. Students allergic to “proofs” will find 
little to scare them. 

Celestial mechanics makes its entry in the third chapter with a discussion of 
orbits under the inverse square law. Here I felt some frustration. The only 
equations considered concern attraction to a fixed center, and in this context the 
author claims to “derive Kepler’s Third Law from Newton’s Law of Universal 
Gravitation.” Assuming a central force, the equation of motion for mass $m_2$ around
mass $m_1$ is

$$m_2\frac{d^2\mathbf{r}}{dt^2} = -Gm_1m_2\frac{\mathbf{r}}{r^3}.$$

But the assumption that the central body is unaccelerated violates Newtonian
mechanics. If we accept Newton’s first and third laws, then a couple of minutes of
derivation produces the correct equation:

$$\frac{d^2\mathbf{r}}{dt^2} = -G(m_1 + m_2)\frac{\mathbf{r}}{r^3}.$$

From this is derived the period of an elliptic orbit with semimajor axis a:

$$P = 2\pi\sqrt{\frac{a^3}{G(m_1 + m_2)}}.$$


It is to this formula that astronomers usually refer when they mention “Kepler’s
third law,” and rightly so, since it is responsible for much of our knowledge of
stellar masses. I find it a pity that only the pre-Newtonian form of the law should
appear in this text. However, the basic variables and equations are clearly introduced
and illustrated. The solution of Kepler’s equation is made an occasion for
discussing the iteration $x_{i+1} = f(x_i)$ for the solution of the equation $x = f(x)$.
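Both ideas in this paragraph are short enough to sketch. The code below is an illustration, not taken from the book under review: it assumes Kepler’s equation in the form $E = M + e\sin E$, so that $f(E) = M + e\sin E$, and it evaluates the two-body period formula with rounded SI values for G, the Sun, the Earth, and the astronomical unit. All function and parameter names are hypothetical.

```python
import math


def solve_kepler(mean_anomaly, eccentricity, tol=1e-14, max_iter=1000):
    """Solve E = M + e*sin(E) by the fixed-point iteration x_{i+1} = f(x_i).

    For 0 <= e < 1 the map f(E) = M + e*sin(E) is a contraction, so the
    iteration converges from any starting value; M itself is used here.
    """
    E = mean_anomaly
    for _ in range(max_iter):
        E_next = mean_anomaly + eccentricity * math.sin(E)
        if abs(E_next - E) < tol:
            return E_next
        E = E_next
    return E


def orbital_period(a, m1, m2, G=6.674e-11):
    """Period P = 2*pi*sqrt(a^3 / (G*(m1 + m2))) in seconds (SI units)."""
    return 2 * math.pi * math.sqrt(a ** 3 / (G * (m1 + m2)))


E = solve_kepler(1.0, 0.3)
days = orbital_period(1.496e11, 1.989e30, 5.972e24) / 86400
print(E, days)
```

The iteration converges slowly when e is close to 1, which is one reason practical codes often switch to Newton’s method.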

An interesting variation appears in a discussion of motion if the constant of 
gravitation is gradually diminishing. There is some speculation that this is the case; 
it could have some interesting astrophysical and dynamical consequences. The 
equation to be solved can be written as 


$$\frac{d^2\mathbf{r}}{dt^2} = -\mu(1 - \epsilon t)\frac{\mathbf{r}}{r^3},$$

where $\epsilon$ is very small. This is discussed using a perturbation procedure. It may be
of interest that if the equation considered is 


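$$\frac{d^2\mathbf{r}}{dt^2} = -\frac{\mu}{1 + \alpha t}\,\frac{\mathbf{r}}{r^3}, \quad\text{i.e.,}\quad
\frac{d^2x}{dt^2} = -\frac{\mu}{1 + \alpha t}\,\frac{x}{r^3}, \quad
\frac{d^2y}{dt^2} = -\frac{\mu}{1 + \alpha t}\,\frac{y}{r^3}, \quad\text{where } r = \sqrt{x^2 + y^2},$$

then transformations, due to Meshchersky, to new coordinates $\xi$, $\eta$ and a new
“time,” $\tau$:

$$\xi = \frac{x}{1 + \alpha t}, \quad \eta = \frac{y}{1 + \alpha t}, \quad \tau = -\frac{1}{\alpha(1 + \alpha t)},$$

produce

$$\frac{d^2\xi}{d\tau^2} = -\mu\frac{\xi}{\rho^3}, \quad \frac{d^2\eta}{d\tau^2} = -\mu\frac{\eta}{\rho^3}, \quad\text{where } \rho = \sqrt{\xi^2 + \eta^2}.$$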
dt p dt p 


Thus, the original problem can be discussed completely using the known solutions 
of the transformed equations. 
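The reduction can be verified with the chain rule. The following worked step is a sketch under stated assumptions: it takes $\xi = x/(1 + \alpha t)$ and a new time with $d\tau/dt = (1 + \alpha t)^{-2}$, for example $\tau = -1/(\alpha(1 + \alpha t))$, and it introduces the abbreviation $s = 1 + \alpha t$ purely for convenience. The same computation applies to $\eta$.

```latex
\begin{align*}
x &= s\,\xi, \qquad s = 1 + \alpha t, \qquad \frac{d\tau}{dt} = \frac{1}{s^2}, \qquad r = s\,\rho,\\
\frac{dx}{dt} &= \alpha\,\xi + s\,\frac{d\xi}{d\tau}\,\frac{d\tau}{dt}
              = \alpha\,\xi + \frac{1}{s}\,\frac{d\xi}{d\tau},\\
\frac{d^2x}{dt^2} &= \frac{\alpha}{s^2}\,\frac{d\xi}{d\tau} - \frac{\alpha}{s^2}\,\frac{d\xi}{d\tau}
                  + \frac{1}{s^3}\,\frac{d^2\xi}{d\tau^2}
                  = \frac{1}{s^3}\,\frac{d^2\xi}{d\tau^2},\\
-\frac{\mu}{s}\,\frac{x}{r^3} &= -\frac{\mu}{s}\,\frac{s\,\xi}{s^3\rho^3}
                  = -\frac{\mu}{s^3}\,\frac{\xi}{\rho^3}
\quad\Longrightarrow\quad \frac{d^2\xi}{d\tau^2} = -\mu\,\frac{\xi}{\rho^3}.
\end{align*}
```

The two first-derivative terms cancel exactly because $d\tau/dt$ is the inverse square of the scale factor s, which is what makes this particular time change work.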

Chapter four, on expansions in elliptic motion, demonstrates that there is more 
to expansions than Taylor’s theorem! Lagrange’s expansion theorem, Bessel func- 
tions, Fourier series, and complex variables are all introduced and used. Some of 
the algebra is none too friendly, but the methods used are not advanced and the 
text is readable. 

Chapter five includes a discussion of Bertrand’s theorem and some application
of differential geometry in the plane.

The remainder of the text deals with solid bodies and potential theory. We
move from two dimensions to three. Chapter six includes a discussion of moments 
of inertia, Eulerian angles, and Euler’s equations, with applications to the conse- 
quences of the non-sphericity of the Earth on its rotation. Chapter seven is not so 
easy to follow, but is needed in part for later applications. It is concerned with 
gravitational potential of ellipsoids. Applications include a discussion of the 
precession of the equinoxes and the nutation. Finally, in chapter eight there are 
discussions of the tides, variation of gravity on the surface of the Earth, and 
ellipsoidal figures of rotating fluid bodies. Over the last century there has been 
beautiful work in this area by the likes of Poincaré, Darwin, Jeans, and 
Chandrasekhar; it is good to see some of this discussed here. 

The text includes many problems. Certainly, to start with, students will find 
these difficult, since they call for understanding and applying the mathematics, not 
just using formulas. Unfortunately in the latter part of the text, the number of 
problems falls off, due, I must add, to the nature of the material and not to any 
failure on the part of the author. 

In my opinion, the text does not constitute a good introduction to the subject 
matter of celestial mechanics. The author has made his own selection based on his 
personal taste; but too much standard elementary material has been omitted. 
Certainly the text would be unsuitable for any student concerned with gaining 
practical ability in celestial mechanics. For a student who might be interested in 
working on the mathematical reaches of the subject, I would recommend the text
Mathematical Introduction to Celestial Mechanics by Harry Pollard, Mathematical 
Association of America, 1976. 

The problems are, indeed, old-fashioned, and there is a lot of nostalgia about 
this text. I think that the author’s rationale for writing it and using it is valid, and
recommend its consideration by other potential instructors. But any instructor
must know the material well and be committed to transmitting it as something to 
be enjoyed. It will be a personal matter; but have a look at the text and give it a 
chance. 


Department of Mathematics 
North Carolina State University 
Raleigh, NC 27695 
danby@math.ncsu.edu 


Strength in Numbers: Discovering the Joy and Power of Mathematics in Everyday Life. By 
Sherman K. Stein. John Wiley & Sons, 1996, xiii + 272, $24.95. 


Reviewed by Jennifer R. Galovich 


On a recent visit to the local McBookstore, and in a somewhat curmudgeonly 
mood, I decided to check out the range of mathematics and science offerings. I 
expected to come away with even more evidence of the general hopelessness of 
engaging the public imagination with respect to mathematics, sure that the shelves 
would be loaded with Stephen Jay Gould’s excellent books and that there would be 
little room left over for the “hard” sciences.

Here are the results of my completely unscientific survey:


Category        Number of titles
Life Science    92
Mathematics     85
Physics         44
Geology         19
Chemistry       3


Well! We don’t do so badly after all! I was pleasantly surprised to find that while 
Professor Gould was indeed well represented, many mathematics titles were also 
stocked. Moreover, there were quite a few more mathematics books classified as 
General Science, apparently depending on whether or not the word “mathematics” 
actually appeared in the title. While the volume under review was not yet on the 
shelves, it deserves to be. This book has something for almost everybody. 

Strength in Numbers is divided into three parts, the first of which is a com- 
pendium of essays for a very general audience. Two chapters on the seductive 
power of numbers are followed by a chapter on how, nevertheless, numbers can be 
an antidote to clever rhetoric. Stein also includes an unusual section on debunking 
myths—some quasi-historical (“As a boy, Einstein was poor in arithmetic.” [p. 42]), 
some cultural (“Mathematics is a dead subject.” [p. 33]). Mathematicians who have
heard such stories for years might be interested to see if they agree with Stein’s 
arguments. 

Educators in general, and high school teachers in particular, will find good uses 
for the extensive table describing the mathematical skills required for various 
occupations [pp. 65-68]. However the sections of most interest to educators and 
other readers of this journal are those in which Stein reviews the history of reform 
in mathematics education in this century. He pays special attention to the SMSG 
curriculum of the 1960s (we know how that came out) and the more recently 
developed NCTM Standards. Stein finds these standards ambitious—but are they 
more ambitious than they need to be? Stein’s views are controversial; given his 
longstanding interest and experience in education at all levels, they are also worth 
reading. 

In the second part, Professor Stein takes a fresh look at some topics accessible 
to anyone with two years of high school mathematics (Algebra I and Geometry). 
Beginning with the definition of a prime number, Stein offers the stories of the 
Mertens and Pólya Conjectures as an object lesson in why it’s not enough just to
check a few (or even a large number of) cases. A tutorial in which the reader 
“discovers” the formula for the sum of a geometric series is paired with a 
discussion of the money multiplier in economics. And a quick review of the 
arithmetic of fractions is followed by discussions of irrational numbers (including 
Euclid’s proof that $\sqrt{2}$ is irrational) and Cantor’s diagonalization argument.

The final part makes more demands on the reader’s staying power. It consists of 
a quick tour through the derivative (“How Steep is a Curve?”) and the integral 
(“Finding a Curved Area”), which relies heavily on some earlier chapters. The 
section culminates in a proof due to Hindu mathematicians of the sum of
Gregory’s series:

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.$$

The proof is pretty strenuous for a general reader, but would be a good project for
calculus students.
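Students taking on such a project might first watch how slowly the series converges. The snippet below is an illustrative sketch only (the function name is hypothetical, and it is not drawn from the book): it sums the first N terms and compares four times the result with π.

```python
import math


def gregory_partial_sum(terms):
    """Sum of the first `terms` terms of 1 - 1/3 + 1/5 - 1/7 + ..."""
    return sum((-1) ** k / (2 * k + 1) for k in range(terms))


for n in (10, 1000, 100_000):
    approximation = 4 * gregory_partial_sum(n)
    print(n, approximation, abs(approximation - math.pi))
```

Since the series is alternating with decreasing terms, the error in the partial sum is below the first omitted term, 1/(2N + 1), so each additional decimal digit costs roughly ten times as many terms.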

Professor Stein is well-known as an excellent expositor. He has a gift for 
explaining a difficult concept by finding an image that carries the central idea. For 
example, his description of the Radon transform as a theoretical way of locating all 
the different fruits and nuts in a fruitcake without cutting it open is particularly 
nice. (Some readers may prefer that the fruitcake simply be transformed). 

There are a few places, however, where level could be a problem. For example, 
in the chapter entitled “The Mother of Invention” we move from the definition of 
whole numbers to the statement of Euler’s generalization of Fermat’s Little 
Theorem in less than a page. The required learning curve here is rather steep! 

In writing for a general audience it can be difficult to decide how much subtlety 
should be mentioned and how much should be airbrushed. In general, Stein 
handles such dilemmas well. The sum of a geometric series, for example, makes an 
early appearance and is used in several later chapters. How much should one say 
about what it means to “add up” infinitely many numbers? Stein disposes of this in 
one neat sentence, conveying the key idea without getting sidetracked. 

Purists may balk at Stein’s take on the “Let’s Make a Deal” problem. Although 
he steps over some of the subtleties (see, e.g., [1]), I found the presentation very 
effective for the intended audience. He recounts the original problem but not the 
answer, and promises readers that they can find the solution and explain it—they 
just need an opportunity and some guidance in thinking mathematically. Stein 
provides both by giving clear directions on how to simulate the problem and 
suggesting questions one might ask along the way. 

Sherman Stein has produced an eminently readable book that should appeal to 
a wide audience. The curmudgeonly view is that while Strength in Numbers is 
eminently readable, it doesn’t follow that it will actually be read. After all, while 
book sales are up, we suspect that book reading may be down. On the other hand, 
if we despair at the appearance of yet another eminently readable book and 
wonder who will read it, then perhaps we might think in a more focused way about 
how to change that state of affairs. Susan Landau’s editorial in the Notices of the 
American Mathematical Society [2] calls us to “rise to the challenge” implicit in 
the public’s increasing awareness of and interest in mathematics. Indirectly, she 
reminds us of our collective and individual responsibility as professionals to 
profess. Some of us profess in the classroom, others as writers, and most of us do 
some of both. More and more of us are involved in activities that help the public 
understand why mathematics is useful and exciting. So I applaud this new book; its 
appearance reminds us—in more than one way—that there really can be “strength 
in numbers.”


TELEGRAPHIC REVIEWS


Edited by Arnold Ostebee 


with the assistance of the Mathematics Departments of 
Carleton, Macalester, and St. Olaf Colleges 


Telegraphic Reviews are designed to alert readers in a timely manner to new books 
appropriate to mathematics teaching and research. Special codes classify reviews by 
subject area and appropriate use: 

T : Textbook
S : Supplementary Reading
P : Professional Reading
C : Computer Software
L : Undergraduate Library
13 : Grade Level
1-4 : Semester
** : Special Emphasis
?? : Questionable

Readers are advised that price information is subject to change. Selected books 
receive a second, more extensive review in the Monthly. 


Books submitted for review should be sent to Book Reviews Editor, American Mathe- 
matical Monthly, St. Olaf College, 1520 St. Olaf Avenue, Northfield, MN 55057-1098. 


General, S(14-16), L*. 101 Careers in Mathematics.
Ed: Andrew Sterrett. MAA, 1996, x +
260 pp, $20 (P). [ISBN 0-88385-704-9] A se- 
lection of informal, personal profiles reprinted 
from the “mathematician of the month” series 
that editor Sterrett produced for several years: 
ordinary math majors who work in a variety of 
careers, generally requiring only bachelor’s or 
master’s degrees. A terrific resource for under- 
graduates who want to know what they might do 
with a math major. Concludes with job-seeking 
advice reprinted from Math Horizons. LAS 


Finite Mathematics, T(13-14: 1). Finite 
Mathematics, Models, and Structure. William 
J. Adams. Kendall/Hunt, 1995, xii + 437 pp, 
$52.44 (P). [ISBN 0-7872-0995-3] Chatty, 
nicely written text. Topics: modeling, matrices, 
linear programming, probability and Bernoulli 
trials, introduction to game theory, logical de- 
duction, etc. RM 


Education, P, L*. The Nature of Mathemati- 
cal Thinking. Eds: Robert J. Sternberg, Talia 
Ben-Zeev. Stud. in Math. Thinking & Learn- 
ing Ser. Lawrence Erlbaum Associates, 1996, 
xiv + 335 pp, $34.50 (P); $69.95. [ISBN 0- 
8058-1799-9; 0-8058-1798-0] Why do some 
find mathematics easy, others painful? Why 
are some good at algebra, but terrible with ge- 
ometry? Why are some successful at busi- 
ness, but no good in mathematics? These 
eleven independently authored chapters attempt 
but fail to reach consensus in response to 
these questions. Their diverse perspectives— 
psychometric, cognitive, educational, cultural, 
and mathematical—have virtually nothing in 
common. Yet as a whole they offer a wealth 
of hypotheses and insights. LAS 


History, P, L*. Modern Mathematics in 
the Light of the Fields Medals. Michael 
Monastyrsky. AK Peters, 1997, xv + 160 pp, 
$35. [ISBN 1-56881-065-2] Expanded English 
translation of a remarkable paper first published 
in Russia in 1991 that provides concise 
(high-level) expository surveys of the mathe- 
matical work of every Fields medalist since the 
prize was first awarded in 1936. Includes his- 
torical background on Fields, on the prize, and 
on the selection process. Two appendices bring 
the work up-to-date. LAS 


Logic, P, L. Deviant Logic, Fuzzy Logic: Be- 
yond the Formalism. Susan Haack. Univ of 
Chicago Pr, 1996, xxvi + 291 pp, $18.95 (P); 
$55. [ISBN 0-226-31134-1; 0-226-31133-3] 
Revision of 1974 Cambridge University Press 
edition (TR, June—July 1975). Adds 5 papers 
on related topics and updates bibliography; re- 
maining material essentially unchanged. Ar- 
gues that while classical logic may need revi- 
sion, none of the proposed alternatives (e.g., 
quantum logics, modal logics, fuzzy logic) is 
satisfactory. Primarily philosophical, rather 
than mathematical, treatment. LB 


Logic, P, L. The Principles of Mathematics 
Revisited. Jaakko Hintikka. Cambridge Univ 
Pr, 1996, xii + 288 pp, $59.95. [ISBN 0-521- 
49692-6] Philosophical essay challenging ba- 
sic assumptions underlying standard first-order 
logic. Explores consequences of a new first- 
order logic (modifies handling of scope of quan- 
tifiers) for foundations of mathematics. Well- 
written and accessible to undergraduates with 
some background in logic. LB 


Logic, P. Sentential Probability Logic: Ori- 
gins, Development, Current Status, and Techni- 
cal Applications. Theodore Hailperin. Lehigh 
Univ Pr, 1996, 304 pp, $43.50. [ISBN 0- 
934223-45-9] Studies a logic in which proba- 
bilities play a semantic role comparable to truth 
values in conventional logic. Includes a thor- 
ough historical survey and applications. LB 


Combinatorics, P. Codes, Designs and Geom- 
etry. Ed: Vladimir Tonchev. Kluwer Academic, 
1996, 120 pp, $115. [ISBN 0-7923-9759-2] 
Papers from a workshop held in 1994 at Michi- 
gan Technological University. 


Discrete Mathematics, P**, L*. Spectral 
Graph Theory. Fran R.K. Chung. CBMS Reg. 
Conf. Ser. in Math., No. 92. AMS, 1997, xi + 
207 pp, $25 (P). [ISBN 0-8218-0315-8] The 
spectrum of a graph is the set of eigenvalues of 
a particular matrix associated with the graph. 
This monograph presents the use of the graph 
spectrum to investigate properties and structures 
of a graph. Emphasizes important connections 
to geometry, but requires no special background 
in geometry. Beautifully clear exposition and 
careful attention to applications and to connec- 
tions with other areas of mathematics. LB 


Number Theory, T(14: 1), S*, L. A Path- 
way Into Number Theory, Second Edition. R.P. 
Burn. Cambridge Univ Pr, 1997, xv + 262 pp, 
$27.95 (P). [ISBN 0-521-57540-0] Introduc- 
tion to number theory through graded problems. 
New material on RSA codes, Gaussian integers, 
and triangular numbers. Expanded historical 
notes. (First Edition, TR, December 1982.) DB 


Linear Algebra, T(17: 2), L. Linear Algebra. 
Peter D. Lax. Pure & Appl. Math. Wiley, 1997, 
xiv + 250 pp, $54.95. [ISBN 0-471-11111-2] 
Graduate-level introduction to linear algebra. 
Standard topics as well as chapters on matrix in- 
equalities, kinematics and dynamics, convexity, 
the duality theorem, and positive matrices. LC 


Linear Algebra, T**(14: 1). Linear Alge- 
bra with Applications. Otto Bretscher. Pren- 
tice Hall, 1997, xiii + 587 pp. [ISBN 0-13- 
190729-8] Appealing order of topics: linear 
systems and transformations, subspaces of R”, 
orthogonality, determinants, eigenvectors, co- 
ordinate systems, differential equations, linear 
spaces. Uses dynamical systems as unifying 
theme. Full of nice applications. Theory moti- 
vated by examples. Great problems. TH 


Group Theory, S(14-15). Laboratory Experiences 
in Group Theory. Ellen Maycock Parker. 
MAA, 1996, xi + 81 pp, $22 (P), with disk. 
[ISBN 0-88385-705-7] 15 labs that encourage 
students to make conjectures and to become 
comfortable with open-ended questions. “Ex- 
ploring Small Groups” software is on the ac- 
companying disk. LC 

Group Theory, T(18: 1), P, L. Lectures on 
Exceptional Lie Groups. J.F. Adams. Ed: Zafer 
Mahmud, Mamoru Mimura. Lect. in Math. 
Ser. Univ of Chicago Pr, 1996, xiv + 122 pp, 
$25 (P). [ISBN 0-226-00527-5] Detailed lec- 
tures given at Cambridge on the construction 
of exceptional Lie groups, their representations, 
and their interconnections. A sequel to Lectures 
on Lie Groups, J.F. Adams (1982). TH 


Algebra, T(14), L. Introductory Modern Alge- 
bra: A Historical Approach. Saul Stahl. Wi- 
ley, 1997, xii + 322 pp, $62.95. [ISBN 0-471- 
16288-4] Introduction to algebra with topics 
chosen according to historical development, fo- 
cusing on the problem of solvability by radi- 
cals. Chapters 1-3 set up solvability culminat- 
ing in a constructibility proof of Gauss; 4—7 
prove Galois’ primitive element theorem; 8-11 
concern group theory, especially relating per- 
mutation groups to solvability. Exercises; some 
with solutions. JD 


Algebra, T(18: 2), P. Integer-Valued Poly- 
nomials. Paul-Jean Cahen, Jean-Luc Chabert. 
Math. Surv. & Mono., V. 48. AMS, 1997, xix 
+ 322 pp, $75. [ISBN 0-8218-0388-3] If the 
integral domain D has quotient field K, this text 
studies the ring of polynomials with coefficients 
in K that map a subset of K into D. LC 


Algebra, P. Prüfer Domains. Marco Fontana, 
James A. Huckaba, Ira J. Papick. Pure & Appl. 
Math., V. 203. Marcel Dekker, 1997, ix + 
328 pp, $150. [ISBN 0-8247-9816-3] Comprehensive 
study of Prüfer domains, including 
overrings, Dedekind domains, trace properties, 
and generalizations of Prüfer domains. TH 


Algebra, P. Zariskian Filtrations. Li Huishi, 
Freddy van Oystaeyen. K-Mono. in Math., V. 2. 
Kluwer Academic, 1996, ix + 252 pp, $127. 
[ISBN 0-7923-4184-8] 


Algebra, T(15-16: 1), L. Rings, Fields, 
and Vector Spaces: An Introduction to Ab- 
stract Algebra via Geometric Constructibil- 
ity. B.A. Sethuraman. Undergrad. Texts in 
Math. Springer-Verlag, 1997, xiii + 190 pp, 
$34.95. [ISBN 0-387-94848-1] Uses geomet- 
ric constructibility as motivation. Intended as a 
one-semester abstract algebra course for future 
teachers. No group theory. Worth a look. LC 


Algebra, P. Commutative Ring Theory. Eds: 
Paul-Jean Cahen, et al. Lect. Notes in Pure 
& Appl. Math., V. 185. Marcel Dekker, 1997, 
xii + 470 pp, $175 (P). [ISBN 0-8247-9815-5] 
Proceedings of the Second International Con- 
ference on Commutative Ring Theory held in 
Fès, Morocco. 


Algebra, P. Operads: Proceedings of Renais- 
sance Conferences. Eds: Jean-Louis Loday, 
James D. Stasheff, Alexander A. Voronov. Con- 
temp. Math., V. 202. AMS, 1997, ix + 443 pp, 
$85 (P). [ISBN 0-8218-0513-4] Papers from 
two 1995 events: a special session in Hart- 
ford, Connecticut, and a conference in Luminy, 
France. 


Calculus, T(13-14). Fundamentals of Cal- 
culus with Applications. William J. Adams. 
Kendall/Hunt, 1996, xiv + 391 pp, (P). [ISBN 
0-7872-1115-X] A standard introduction to 
calculus for students in business, et al. Well- 
written; excellent examples and clever illustra- 
tions. Worth considering. PF 


Complex Analysis, P. Extremal Riemann Sur- 
faces. Eds: J.R. Quine, Peter Sarnak. Con- 
temp. Math., V. 201. AMS, 1997, xz + 243 pp, 
$49 (P). [ISBN 0-8218-0514-2] Papers so- 
licited for an AMS Special Session at the 1995 
meeting in San Francisco. 


Differential Equations, T(15), L. Computa- 
tional Differential Equations. K. Eriksson, et 
al. Cambridge Univ Pr, 1996, xvi + 538 pp, 
$100; $44.95 (P). [ISBN 0-521-56312-7; 0- 
521-56738-6] Finite element methods for lin- 
ear ordinary and partial differential equations. 
Combines mathematical analysis and computa- 
tion. Develops topics of error estimation and 
adaptive error control throughout. AO 


Numerical Analysis, T(15: 2), L. Theory and 
Applications of Numerical Analysis, Second 
Edition. G.M. Phillips, P.J. Taylor. Academic 
Pr, 1996, xii + 447 pp, $24.95 (P). [ISBN 0- 
12-553560-0] Two new chapters: splines and 
other approximations; matrix eigenvalues and 
eigenvectors. Computing exercises have been 
added at the end of each chapter. (First Edition, 
TR, November 1974.) AO 


Numerical Analysis, P. Level Set Methods. 
J.A. Sethian. Cambridge Univ Pr, 1996, xviii 
+ 218 pp, $39.95. [ISBN 0-521-57202-9] 
State-of-the-art computational techniques for 
modeling the evolution of boundaries and in- 
terfaces in a variety of application areas (e.g., 
burning flames, ocean waves, medical imaging, 
grid generation). AO 


Functional Analysis, P. Functional Analysis. 
Eds: Susanne Dierolf, Seán Dineen, Pawel 
Domanski. Walter de Gruyter, 1996, xi + 
473 pp, DM 268. [ISBN 3-11-014617-7] Proceedings 
of a 1994 workshop at Trier University, 
Germany. 


Algebraic Geometry, P. Collected Papers of 
Giacomo Albanese. Eds: Ciro Ciliberto, Paulo 
Ribenboim, Edoardo Sernesi. Papers in Pure & 
Appl. Math., V. 103. Queen’s Univ, 1996, xii + 
182 pp, (P). [ISBN 0-88911-737-3] 


Differential Geometry, P. Sub-Riemannian 
Geometry. Eds: André Bellaiche, Jean-Jacques 
Risler. Prog. in Math., V. 144. Birkhäuser 
Boston, 1996, viii + 393 pp, $84.50. [ISBN 0- 
8176-5476-3] 5 articles provide an introduc- 
tion to the field and survey the state-of-the-art. 


Algebraic Topology, T(18), P. Sheaf The- 
ory, Second Edition. Glen E. Bredon. Grad. 
Texts in Math., V. 170. Springer-Verlag, 1997, 
xi + 502 pp, $59.95. [ISBN 0-387-94905- 
4] Sheaf-theoretic cohomology from an alge- 
braic topology point-of-view. Assumes sub- 
stantial background in homological algebra and 
algebraic topology. Several cohomology theo- 
ries are compared, applications of spectral se- 
quences given. New edition includes Oliver 
transfer, Conner conjecture, intersection theory. 
Exercises; some with solutions. JD 


Game Theory, P, L*. Games of No Chance. 
Ed: Richard J. Nowakowski. Math. Sci. Res. 
Inst. Public., V. 29. Cambridge Univ Pr, 1996, 
xiii + 537 pp, $49.95. [ISBN 0-521-57411-0] 
Papers from a 1994 workshop on combinatorial 
game theory at MSRI in Berkeley. Includes in- 
troductory papers, papers on “classical” games 
(e.g., chess and Go), as well as others. Con- 
cludes with a list of open problems and a com- 
prehensive bibliography. 


Optimal Control, T(16-17). Optimal Con- 
trol: Basics and Beyond. Peter Whittle. Ser. 
in Systems & Optimiz. Wiley, 1996, ix + 
464 pp, $49.95 (P). [ISBN 0-471-95679-1] 
“Basics” includes classical topics (e.g., stabil- 
ity, feedback, controllability) as well as more 
general optimization techniques (e.g., dynamic 
programming, the Pontryagin maximum prin- 
ciple). “Beyond” topics: risk-sensitive and H∞ 
criteria; time-integral methods and optimal sta- 
tionary policies; near-determinism and large de- 
viation theory. AO 


Probability, S(13-15). Your Intuition Is 
Wrong! Marc T. Simon. Dorrance Pub, 1996, 
xi + 47 pp, $9 (P). [ISBN 0-8059-3834-6] 
Two dozen wordy word problems in elemen- 
tary probability embedded in gaming and casino 
contexts. Mostly standard types: birthday prob- 
lems, stopping strategies, runs, coin tossing, 
ums, etc. Concludes with brief explanations 
at the end. LAS 


Mathematical Statistics, T(17-18: 1). Aspects 
of Statistical Inference. A.H. Welsh. Ser. in 
Prob. & Stat. Wiley, 1996, xviii + 451 pp, 
$59.95. [ISBN 0-471-11591-6] Includes sev- 
eral non-standard topics: robustness, random- 
ization, finite population inference, computa- 
tional methods based on simulation and smooth- 
ing methods. Simple data sets used to motivate 
topics and emphasize practical aspects of infer- 
ence. Assumes some distribution theory. RS 


Statistical Methods, S(17-18), P. Robust 
Statistical Procedures, Second Edition. Peter 
J. Huber. CBMS-NSF Reg. Conf. Ser. in 
Appl. Math., V. 68. SIAM, 1996, ix + 67 pp, 
$18.50 (P). [ISBN 0-89871-379-X] Brief, 
well-organized introduction and overview of ro- 
bust statistics (First Edition, TR, April 1978). 
This edition adds a chapter on recent develop- 
ments and updates the list of references. RS 


Statistical Methods, T(16-17: 1, 2). Anal- 
ysis of Variance, Design and Regression: Ap- 
plied Statistical Methods. Ronald Christensen. 
Chapman & Hall, 1996, xvi + 587 pp, $54.95. 
[ISBN 0-412-06291-7] Includes matrix for- 
mulation of regression and ANOVA models. 
Examples used to motivate theory rather than 
simply to illustrate techniques. Minitab output 
and commands incorporated throughout. Read- 
able. Accessible to students with weaker math- 
ematical background. RS 


Statistical Methods, S(18), P. Modern Multi- 
dimensional Scaling: Theory and Applications. 
Ingwer Borg, Patrick Groenen. Ser. in Stat. 
Springer-Verlag, 1997, xvii + 471 pp, $54.95. 
[ISBN 0-387-94845-7] Comprehensive pre- 
sentation of multidimensional scaling (MDS). 
Five parts: (1) introduction—useful for individ- 
uals with applied interests, (2) technical aspects 
of MDS, (3) unfolding as a special case of MDS, 
(4) geometry of MDS, (5) techniques/models 
closely associated with MDS. Appendix de- 
scribes major software packages. KB 


Algorithms, P. Proceedings of the Eighth An- 
nual ACM-SIAM Symposium on Discrete Al- 
gorithms. ACM & SIAM, 1997, x + 788 pp, 
$79.50 (P). [ISBN 0-89871-390-0] 

Theory of Computation, T?(16—17: 1), S, P. 
Communication Complexity. Eyal Kushilevitz, 
Noam Nisan. Cambridge Univ Pr, 1997, xiii + 
189 pp, $37.95. [ISBN 0-521-56067-5] Summary 
of the mathematical development (with 
nice practical examples and motivation) of the 
theory of what needs to be communicated and 
the amount of communication necessary to 
solve a problem. Begins with Yao’s two-party 
model, extended to more general models. RM 


Computer Science, P, L. Handbook of Applied 
Cryptography. Alfred J. Menezes, Paul 
C. van Oorschot, Scott A. Vanstone. Ser. on 
Disc. Math. & Its Applic. CRC Pr, 1997, 
xxviii + 780 pp, $79.95. [ISBN 0-8493-8523-7] 
An encyclopedia of practical cryptologic 
techniques (conventional and public-key). De- 
tailed descriptions of algorithms and extensive 
references to the research literature. AO 


Computer Science, P*, L**. Selected Papers 
on Computer Science. Donald E. Knuth. CSLI 
Lect. Notes, No. 59. Center for the Study of 
Language & Information (Leland Stanford Ju- 
nior Univ., Stanford, CA 94305) & Cambridge 
Univ Pr, 1996, xii + 274 pp, $69.95; $24.95 (P). 
[ISBN 1-881526-92-5; 1-881526-91-7] From 
the Preface: “This book assembles under one 
roof all of the things I’ve written about com- 
puter science for people who aren’t necessarily 
specialists in the subject.” 


Applications (Fluid Mechanics), P. Homogenization 
and Porous Media. Ed: Ulrich 
Hornung. Interdisc. Appl. Math., V. 6. 
Springer-Verlag, 1997, xvi + 279 pp, $59.95. 
[ISBN 0-387-94786-8] 10 articles survey the 
method of homogenization applied to problems 
in porous media. 


Applications (Physical Science), P. Inverse 
Problems in Geophysical Applications. Eds: 
Heinz W. Engl, Alfred K. Louis, William Run- 
dell. SIAM, 1997, x + 303 pp, $81 (P). 
[ISBN 0-89871-381-1] Proceedings of a 1995 
GAMM-SIAM conference in Yosemite, Cali- 
fornia. 


Applications (Systems Theory), P. Lecture 
Notes in Control and Information Sciences— 
222: Control Using Logic-Based Switching. 
Ed: A. Stephen Morse. Springer-Verlag, 1997, 
viii + 276 pp, $54 (P). [ISBN 3-540-76097-0] 
23 papers from a 1995 workshop held on Block 
Island, Rhode Island. 


Applications (Systems Theory), P. Lecture 
Notes in Control and Information Sciences— 
208: Proceedings of Workshop on Advances 
in Control and its Applications. Eds: H.K. 
Khalil, J.H. Chow, P.A. Ioannou. Springer-Verlag, 
1996, xxiii + 319 pp, $59 (P). [ISBN 
3-540-19993-4] Papers from a 1994 event at 
the University of Illinois, Urbana-Champaign. 


Reviewers 


KB: Karla Ballman, Macalester; LB: Lynne Baur, Carleton; 
DB: David Bressoud, Macalester; LC: Laura Chihara, 
St. Olaf; JD: Jill Dietz, St. Olaf; PF: Paul Froeschl, 
Macalester; TH: Tom Halverson, Macalester; RM: Richard 
Molnar, Macalester; AO: Arnold Ostebee, St. Olaf; RS: 
Richard Single, St. Olaf; LAS: Lynn Arthur Steen, 
St. Olaf. 


University of Leeds, U.K., with a thesis supervised by J. T. Stafford. He now 
lectures at the Department of Pure Mathematics of the Federal University of Rio 
de Janeiro. His research interests include algebraic K-theory, non-commutative 

TIBERIU TRIF received his M.S. degree in mathematics in 1995 at the Babes- 

California, in Moraga, California, a Senior Researcher at the Center for the Study 

States. He is the current editor of the MAA newsletter FOCUS and the author of 
Devlin’s Angle, a monthly column on MAA Online. 

Rice University in 1975 and his Ph.D. from the University of Chicago in 1980, 
under the direction of Alberto P. Calderón. While a student at Rice and Chicago, 
he managed to take classes from both Salomon Bochner and Antoni Zygmund. He 
has been a faculty member at California State University, Long Beach since 1985. 


VIET NGO was born in 1956 in Hue, Vietnam. He received a B.S. in Mathematics 
from the University of Minnesota in 1976 and a Ph.D. from the University of 
before coming to the U.S. He taught in the astronomy departments at the 
University of Minnesota and Yale before joining the department of mathematics at 
N.C. State. He is the author of texts and software in the areas of dynamical 
astronomy and numerical applications of differential equations. He was recently a 
winner in the “Computers in Physics” annual Educational Software Contest. In 
JENNIFER GALOVICH received her undergraduate degree from Reed College in 
1969, Master’s degree from Brown University in 1972 and, with time off for good 
behavior, her Ph.D. from the University of Minnesota in 1993. Her research 
LESTER R. FORD AWARDS FOR 1996 


The Lester R. Ford Awards, established in 1964, are made annually to authors of 
outstanding expository papers in the MONTHLY. The awards are named for Lester 
R. Ford, Sr., a distinguished mathematician, editor of the MONTHLY (1942-46), 
and President of the Mathematical Association of America (1947-48). 

Winners of the Lester R. Ford awards for expository papers appearing in 
Volume 103 (1996) of the MONTHLY are: 


Robert G. Bartle, Eastern Michigan University. 
Return to the Riemann Integral, pp. 625-632. 


Every calculus student sees the Riemann integral, but it is not flexible or 
general enough for more technical problems: not enough functions are 
Riemann integrable, and interchanging limits with a Riemann integral is 
difficult in the absence of uniform convergence. The Lebesgue integral 
handles these problems, but it requires the separate study of measure theory 
and still fails to include some important improper integrals. In this paper, 
Bartle shows that a generalized Riemann integral captures the advantages of 
both the Riemann integral and the Lebesgue integral without incurring the 
major disadvantages of either of the two classical approaches. Bartle points 
out that the generalized Riemann integral is more general than the Lebesgue 
integral, in the sense that the set of Lebesgue integrable functions is strictly 
contained in the set of generalized Riemann integrable functions. The author 
derives a strong form of the fundamental theorem of calculus and shows how 
measure theory may be recovered from the theory of generalized Riemann 
integrals. He demonstrates the advantages of the generalized Riemann 
integral in dealing with improper integrals and convergence theorems. By the 
end of the paper, Bartle has presented a strong argument for replacing the 
Lebesgue integral with the generalized Riemann integral. 


A. F. Beardon, University of Cambridge. 
Sums of Powers of Integers, pp. 201-213. 


The square of the sum of the numbers from 1 to the given number equals the sum 
of the cubes of the numbers from 1 to the given number: that was how Levi ben 
Gerson, in the 14th century, stated the identity that is the starting point for 
this paper. If we write $\sigma_k(n)$ for the sum $1^k + 2^k + \cdots + n^k$ of k-th powers, this 
remarkable identity translates to $\sigma_3(n) = \sigma_1(n)^2$. It cries out for generalization, 
and Beardon’s goal in this paper is to find all polynomial relations 
between $\sigma_i$ and $\sigma_j$, for all $i$ and $j$. He achieves this by interpreting the 
question in terms of the elementary theory of algebraic curves: a polynomial 
relation $T(\sigma_i, \sigma_j) = 0$ means that the curve defined by $T(x, y) = 0$ contains 
all the points $(\sigma_i(n), \sigma_j(n))$. On the way to the main theorem, we meet 
Bernoulli numbers, Faulhaber polynomials, and several basic results about 
algebraic curves. By immersing his question in a general theory, Beardon not 
only finds a complete answer, but also shows us the power and the charm of 
the theory of algebraic curves. 
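
The identity that opens Beardon's paper is easy to check numerically. The following is only a small sketch, not anything taken from the paper: the helper names `power_sum` and `check_identity` are made up for illustration, and the check covers a finite range rather than proving the identity.

```python
def power_sum(k, n):
    """Return 1**k + 2**k + ... + n**k using exact integer arithmetic."""
    return sum(m ** k for m in range(1, n + 1))


def check_identity(limit):
    """Check that the sum of cubes equals the squared sum for n = 1..limit."""
    return all(
        power_sum(3, n) == power_sum(1, n) ** 2
        for n in range(1, limit + 1)
    )


if __name__ == "__main__":
    # For n = 4: 1 + 2 + 3 + 4 = 10 and 1 + 8 + 27 + 64 = 100 = 10**2.
    print(power_sum(1, 4), power_sum(3, 4))
    print(check_identity(200))
```

In the language of the citation, each pair `(power_sum(1, n), power_sum(3, n))` is a point on the curve y = x^2, which is the simplest instance of the polynomial relations the paper classifies.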
John Brillhart, University of Arizona, and Patrick Morton, Wellesley College. 
A Case Study in Mathematical Research: The Golay-Rudin-Shapiro Sequence, 
pp. 854-869. 


There’s a special feel to being on the trail of a discovery, and this paper 
conveys that feeling to its readers. The basic topic is just a simple sequence of 
1’s and -1’s, and the first observation is that in the first 32,000 steps there 
are always more 1’s than -1’s. Yet there is no discernible reason why this 
should be true. To understand it, Brillhart and Morton have to look deeper, 
experimentally detecting a “wave-like” pattern in the excesses and describing 
it precisely enough that its existence can be proved in general. But this is just 
the beginning. The excesses s(n) are observed to be about the same size as $\sqrt{n}$. 
Computation with special cases shows that the ratio $s(n)/\sqrt{n}$ can approach 
any value between $\sqrt{3/5}$ and $\sqrt{6}$; does it always stay in that small range? 
They piece together an argument for the upper bound and modify it to get 
the lower bound—until at the last step the method absolutely refuses to 
work. Undeterred, they look around for a new approach, finally discovering 
several “mysterious and elegant properties” that produce the final proof. 
Students who want to know what mathematical research feels like can find 
out here. 
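
Readers who want to repeat the opening experiment can do so in a few lines. The sketch below rests on assumptions not spelled out in the citation: it uses the usual definition of the Golay-Rudin-Shapiro sequence (the n-th term is -1 when the binary expansion of n contains an odd number of possibly overlapping blocks "11", and +1 otherwise) and takes s(n) to be the sum of terms 0 through n. The function names are illustrative only.

```python
def rudin_shapiro(n):
    """Return +1 or -1 by the parity of the count of '11' blocks in binary n."""
    # Bit i of n & (n >> 1) is set exactly when bits i and i+1 of n are both 1.
    blocks = bin(n & (n >> 1)).count("1")
    return -1 if blocks % 2 else 1


def partial_sums(limit):
    """Return the list [s(0), s(1), ..., s(limit)] of running sums."""
    sums, total = [], 0
    for n in range(limit + 1):
        total += rudin_shapiro(n)
        sums.append(total)
    return sums


def ratio_range(limit):
    """Return the smallest and largest value of s(n)/sqrt(n) for 1 <= n <= limit."""
    s = partial_sums(limit)
    ratios = [s[n] / n ** 0.5 for n in range(1, limit + 1)]
    return min(ratios), max(ratios)


if __name__ == "__main__":
    s = partial_sums(32000)
    print(min(s) > 0)          # the 1's stay ahead throughout this range
    print(ratio_range(32000))  # compare with sqrt(3/5) and sqrt(6)
```

A finite computation like this can only suggest the pattern; the point of the paper is the argument that it persists for every n.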
Constance Reid 


Constance Reid, an established writer about mathematicians, 
has written an excellent and loving book about her 
sister Julia Robinson, the mathematician. The author has 
written that she wants the book to be one for all age 
groups and she has succeeded admirably in making it 
so.... Julia wanted to be known as a mathematician, not a 
woman mathematician and rightly so! However, she was, 
and is, a wonderful role model for women aspiring to be 
mathematicians. What a great gift this book would be! 
—Alice Schafer, Former President, AWM 


This book is a small treasure, one which I want to share 
with all my mathematical friends. The assembly of sev- 
eral articles and additional photos and remarks provides 
the image of a mathematician of extraordinary taste, 
tenacity and generosity.... Julia Robinson broke ground 
in displaying the deep connections between number the- 
ory and logic. Her results have led to a very active area 
today, making the appearance of this book very timely. 
Her work and her example are however timeless and I 
can think of no better advice to give a young mathematician, 
either in how to do mathematics or how to 
behave in mathematics, than: “Be like Julia!” 

—Carol Wood, Deputy Director, MSRI 


In high school Julia Bowman stood alone as the 
only girl—and the best student—in her junior and 
senior math classes. She had only one close friend 
and no boyfriends. Although she was to learn (from 
E. T. Bell’s Men of Mathematics) that there are such 
people as mathematicians, her ambition was merely 
to get a job teaching mathematics in high school. 


At great sacrifice her widowed stepmother sent her 
to the University of California at Berkeley to obtain 
the necessary teaching credentials. But at Berkeley, 
in a society of mathematicians, she discovered her- 
self. She was not the duckling that didn’t belong, but 
a swan. There was also a prince at Berkeley, a bril- 
liant young assistant professor named Raphael 
Robinson. Theirs was to be a marriage that would 
endure until her death in 1985. 


Julia is the story of the life of Julia Bowman Robinson, 
the gifted and highly original mathematician who dur- 
ing her lifetime was recognized in ways that no other 
woman mathematician had been recognized up to that 
time. In 1976 she became the first woman mathemati- 
cian elected to the National Academy of Sciences and 
in 1983 the first woman elected president of the 
American Mathematical Society. 


This unusual book, profusely illustrated with previ- 
ously unpublished personal and mathematical memo- 
rabilia, brings together in one volume the prizewin- 
ning “Autobiography of Julia Robinson” by her sister, 
the popular mathematical biographer Constance 
Reid, and three very personal articles about her work 
by outstanding mathematical colleagues. 


All royalties from sales of this book will go to fund a 
Julia Robinson Prize in Mathematics at the high 
school from which she graduated. 
DERIVE is the trusted mathematical assistant relied upon by students, 
educators, engineers, and scientists around the world. It does for algebra, 
equations, trigonometry, vectors, matrices, and calculus what the 
scientific calculator does for numbers: it eliminates the drudgery of 
performing long and tedious mathematical calculations. You can easily solve 
both symbolic and numeric problems and see the results plotted as 2D or 
3D graphs. 

For everyday mathematical work DERIVE is a tireless, powerful, and 
knowledgeable assistant. For teaching or learning mathematics, DERIVE gives 
you the freedom to explore different mathematical approaches better and 
more quickly than by using traditional methods. 

System Requirements: Windows 95, 3.1x or NT running on a computer with 
8 megabytes of memory. 

Suggested Retail Price: $250. Educational pricing available. 

For product information and list of dealers, fax, email, write, or call Soft 
Warehouse, Inc. or visit our website at http://www.derive.com. 

The Easiest just got Easier. 

Soft Warehouse, Inc. • 3660 Waialae Avenue, Suite 304 • Honolulu, Hawaii, USA 96816-3259 
Telephone: (808) 734-5801 after 10:00 a.m. PST 
Fax: (808) 735-1105 • Email: swn@aloha.com 

© 1996 Soft Warehouse, Inc. DERIVE is a registered trademark of Soft Warehouse, 
Inc. Other trademarks are the property of their respective owners. 
A lab manual with software for introductory courses 
in group theory or abstract algebra 


Laboratory Experiences in Group Theory is a workbook 
of 15 laboratories designed to be used with the software 
Exploring Small Groups as a supplement to the regular 
textbook in an introductory course in group theory or 
abstract algebra. Written in a step-by-step manner, the 
laboratories encourage students to discover the basic 
concepts of group theory and to make conjectures from 
examples that are easily generated by the software. 
The labs can be assigned as homework or can be used 
in a structured laboratory setting. Since the software is 
user-friendly and the laboratories are complete, stu- 
dents and faculty should have no difficulty in using the 
labs without training. 


Most students find that the laboratories provide an 
enjoyable alternative to the “theorem-proof-example” 
format of a standard abstract algebra course. At the end 
of the semester, one student wrote in his evaluation of 
the course: 


I am truly grateful for the laboratory component...Work 
on the computer helped to make the abstract theory 
more concrete... One of the best things about the labs 
was that we formed our own conjectures about the patterns 
we saw...I believe that the progression of (1) lab, 
(2) conjecture, (3) class discussion, and (4) proof was 
highly beneficial in gaining understanding of the 
abstract material of the course. 


Laboratory Experiences 
in Group Theory 

A Manual to be Used with 
Exploring Small Groups 

Ellen Maycock Parker 

Series: Classroom Resource Materials 


Table of Contents: 1. Groups and Geometry; 2. Cayley 
Tables; 3. Cyclic Groups and Cyclic Subgroups; 4. 
Subgroups and Subgroup Lattices; 5. The Center and 
Commutator Subgroups; 6. Quotient Groups; 7. Direct 
Products; 8. The Unitary Groups; 9. Composition 
Series; 10. Introduction to Endomorphisms; 11. The 
Inner Automorphisms of a Group; 12. The Kernel of an 
Endomorphism; 13. The Class Equation; 14. Conjugate 
Subgroups; 15. The Sylow Theorems; Appendix A. 
Table Generation Menu of Exploring Small Groups 
(ESG); Appendix B. Sample Library of ESG; Appendix 
C. Group Library of ESG; Appendix D. Group 
Properties Menu 


Exploring Small Groups, the software packaged with 
this lab manual, is on a 3 1/2” DD PC compatible disk. 
This is a DOS program that can be run in Windows. 
The software was developed by Ladnor Geissinger, 
University of North Carolina at Chapel Hill. 
Information Flow 
The Logic of Distributed Systems 

Jon Barwise and Jerry Seligman 

The authors, observing that information flow is pos- 
sible only within a connected distribution system, 
provide a mathematically rigorous, philosophically 
sound foundation for a science of information. 
They illustrate their theory by applying it to a wide 
range of phenomena, from file transfer to DNA, 
from quantum mechanics to speech act theory. 
Cambridge Tracts in Theoretical Computer Science 44 
1997 c. 256 pp. 58386-1 Hardback $39.95 


Combinatorics of Finite Geometries 
Second Edition 

Lynn Margaret Batten 

The revised edition contains an entirely new chapter 
on blocking sets in linear spaces, which highlights 
some of the most important applications of blocking 
sets—from the initial game-theoretic setting to their 
very recent use in cryptography. 


1997 207 pp. 59014-0 Hardback $64.95 
59993-8 Paperback $24.95 


Financial Calculus 
An Introduction to Derivative Pricing 

Martin Baxter and Andrew Rennie 

With mathematical precision and in a style tailored 
for market practitioners, the authors describe key 
concepts such as martingales, change of measure, and 
the Heath-Jarrow-Morton model. They also provide 
a full glossary of probabilistic and financial terms. 
1996 242 pp. 55289-3 Hardback $39.95 


Calendrical Calculations 

Nachum Dershowitz and Edward M. Reingold 

In this book the authors present simple algorithms for 
calendrical calculations, carefully coupled with deep 
and insightful research results in the general areas of 
algorithms touched on by such manipulations. The 
material will be supplemented with code to implement 
many of the algorithms and prefaced by an 
introduction to the world’s calendars. 

1997 c. 160 pp. 56413-1 Hardback $64.95 
56474-3 Paperback $22.95 


Available in bookstores or from 
Polyhedra 
P. Cromwell 
This book comprehensively documents 
the many and varied ways that polyhedra 
have come to the fore throughout 
the development of mathematics. 
1997 464 pp. 55432-2 Hardback $44.95 


The Pleasures of Counting 
T. W. Körner 

The author uses relatively simple terms and ideas, 
yet explains difficulties and avoids condescension. 
If you are a mathematician who wants to explain to 
others how you spend your working days, then seek 
inspiration here. 
1996 544 pp. 56087-X Hardback $59.95 
56823-4 Paperback $34.95 


Finite Fields 

Second Edition 

Rudolf Lidl and 

Harald Niederreiter 

This updated second edition is devoted entirely to the 
theory of finite fields, and it provides comprehensive 
coverage of the literature. Worked examples and lists 
of exercises throughout the book make it useful as a 
text for advanced level courses for students of algebra. 
Encyclopedia of Mathematics and its Applications 20 
1997 769 pp. 39231-4 Hardback $95.00 


Thinking About Ordinary 
Differential Equations 

Robert E. O'Malley, Jr. 

This book stresses alternative examples and analyses 
by means of which students can understand a number 
of approaches to finding solutions and understanding 
their behavior. 

Cambridge Texts in Applied Mathematics 18 


1997 257 pp. 55314-8 Hardback $69.95 
55742-9 Paperback $24.95 
This is a revised, updated and augmented edition of 
a classic Carus monograph (a bestseller for over 25 
years) on the theory of functions of a real variable. 
Earlier editions of this classic Carus Monograph cov- 
ered sets, metric spaces, continuous functions, and 
differentiable functions. The fourth edition adds sec- 
tions on measurable sets and functions, the Lebesgue 
and Stieltjes integrals, and applications. The book is 
accessible to readers with some mathematical sophis- 
tication and a background in calculus. It is suitable 
either for self-study or for supplemental reading in a 
course on advanced calculus or real analysis. 


Not intended as a systematic treatise, this book has 
more the character of a sequence of lectures on a 
variety of topics connected with real functions. 
Many of these topics are not commonly encountered 
in undergraduate textbooks: for example, the exis- 
tence of continuous everywhere-oscillating functions 
(via the Baire category theorem); two functions hav- 
ing equal derivatives, yet not differing by a constant; 
application of Stieltjes integration to the speed of 
convergence of infinite series. 
A Primer of Real 
Functions 


by Ralph P. Boas 
Revised and updated by Harold P. Boas 


Series: Carus Mathematical Monograph 


Table of Contents: 

I. Sets: Sets of real numbers, Countable and uncount- 
able sets, Metric spaces, Open and closed sets, Dense 
and nowhere dense sets, Compactness, Convergence 
and completeness, Nested sets and Baire’s theorem, 
Some applications of Baire’s theorem, Sets of mea- 
sure zero. II. Functions: Functions, Continuous func- 
tions, Properties of continuous functions, Upper and 
lower limits, Sequences of functions, Uniform con- 
vergence, Pointwise limits of continuous functions, 
Approximations to continuous functions, Linear func- 
tions, Derivatives, Monotonic functions, Convex func- 
tions, Infinitely differentiable functions. III. 
Integration: Lebesgue measure, Measurable functions, 
Definition of the Lebesgue integral, Properties of 
Lebesgue integrals, Application of the Lebesgue inte- 
gral, Stieltjes integrals, Applications of the Stieltjes 
integral, Partial sums of infinite series. 
Lion Hunting 


and Other Mathematical Pursuits 


A Collection of Mathematics, Verse, and Stories 


by Ralph P. Boas, Jr. 


Gerald L. Alexanderson and 
Dale H. Mugler, Editors 


I highly recommend Lion Hunting and Other 
Mathematical Pursuits to high school mathematics 
clubs, mathematics teachers of all levels, and anyone 
interested in mathematics. Perhaps the most impor- 
tant features of this book is how it subtly makes the 
reader aware of the nature of mathematics. 


~ The Mathematics Teacher 


As a young man at the Institute for Advanced Study in 
Princeton, Ralph Philip Boas, Jr., together with a group of 
other mathematicians, published a light-hearted article on 
the “mathematics of lion hunting” under a pseudonym 
(1938). This sparked a sequence of articles on the topic, 
several of which are drawn together in this book. 


Lion Hunting includes an assortment of articles that show 
the many facets of this remarkable mathematician, editor, 
writer, and teacher. Along with a variety of his lighter 
mathematical papers, the collection includes Boas’ verse 
and short stories, many of which are appearing for the first 
time. Anecdotes and recollections of his numerous experi- 
ences and of his work and meetings with many distin- 
guished mathematicians and scientists of his day are also 
included as well as photographs taken by Boas of Hardy, 
Littlewood, Besicovitch, Weil, and others. 


The mathematical articles in this collection cover a range 
of topics. They include articles on infinite series, the mean 
value theorem, indeterminate forms, complex variables, 
inverse functions, extremal problems for polynomials and 
more. A special section of this book is devoted to articles 
about the teaching of mathematics, with titles such as 
“Calculus as an experimental science” and “Can we make 
mathematics intelligible?” 


Boas’s wit and playful humor are reflected in the verses 
included in this collection. The verses reflect the phases of 
his career as author, editor, teacher, department chair, and 
lover of literature. A section of the book describes the feud 
that Boas supposedly had with Bourbaki. Also included are 
many amusing anecdotes about famous mathematicians. 
Vita Mathematica 


Historical Research and 
Integration with Teaching 


Ronald Calinger, Editor 


The use of the history of mathematics in the 
teaching of mathematics at all levels is an idea 
whose time has come. To use history in the 
teaching of undergraduate mathematics, the 
instructor must be familiar with the history as 
well as the mathematics. Vita Mathematica will 
enable college teachers to learn the relevant histo- 
ry of various topics in the undergraduate curricu- 
lum and help them incorporate this history in 
their teaching. 

For example, should calculus be approached 
from a geometric or an algebraic point of view? 
The book shows us how two important eigh- 
teenth century mathematicians, Colin Maclaurin 
and Joseph-Louis Lagrange, understood the calcu- 
lus from these different standpoints and how their 
legacy is still important in teaching calculus 
today. We also learn why Lagrange’s algebraic 
approach dominated teaching in Germany in the 
nineteenth century. Some of the reasons for this 
are related to the appropriate foundations of the 
calculus, and so the book traces the ancient histo- 
ry of one of the possible foundations, the concept 
of indivisibles. Even though we generally do not 
use this concept formally today, many ideas for a 
heuristic approach to the calculus can be devel- 
oped out of its study. 

Vita Mathematica contains numerous other 
articles dealing with calculus, with algebra, com- 
binatorics, graph theory, and geometry, as well as 
more general articles on teaching courses for 
prospective teachers. 

This volume, then, demonstrates that the his- 
tory of mathematics is no longer tangential to the 
mathematics curriculum, but in fact deserves a 
central role. 


Catalog Code: NTE40 
350 pp., Paperbound, 1996, ISBN 0-88385-097-4 
List: $34.95 MAA Member: $29.00 


ORDER FROM: 
THE MATHEMATICAL ASSOCIATION OF AMERICA 
P.O. Box 91112, Washington, DC 20090-1112 


1-800-331-1622 




(301) 617-7800 FAX (301) 206-9789 




The Lighter Side 
of Mathematics 


Proceedings of the Eugéne Strens Memorial Conference 
on Recreational Mathematics and its History 


Richard K. Guy and 
Robert E. Woodrow, Editors 


The level of exposition is high, and the fun infectious. 
The reader can find routes to serious mathematics, 
such as hyperbolic geometry, fractals, group theory, 
and number theory, all beginning with a delightful 
puzzle. A sparkling addition for any library where the 
lover of mathematics at any level comes for support. 
—Choice 


The book is a fantastic feast of far-from-trivial topics. 
Entertaining mathematics not only can lead to unexpect- 
ed applications...but it is one of the best ways to stimu- 
late interest in mathematics among both students and 
the general public. 

—Martin Gardner, American Scientist 


In August of 1986 a special conference on recreational 
mathematics was held at the University of Calgary to 
celebrate the founding of the Strens Collection. Leading 
practitioners of recreational mathematics from around 
the world gathered in Calgary to share with each other 
the joy and spirit of play that is to be found in recreation- 
al mathematics. 


The papers in this volume represent a treasure trove of 
recreational mathematics by a star-studded cast: Leon 
Bankoff, Elwyn Berlekamp, H.S.M. Coxeter, Ken Falconer, 
Branko Grünbaum, Richard Guy, Doris Schattschneider, 
David Singmaster, Athelstan Spilhaus, Stan Wagon and 
many others. 


If you are interested in tessellations, Escher, tiling, 
Rubik’s cube, pentominoes, games, puzzles, the arbelos, 
The MAA is proud to reissue Martin 
Gardner’s Penrose Tiles to Trapdoor Ciphers, 
printed with a new bibliography, correc- 
tions to the text, and a postscript from the 
author. Penrose Tiles assembles a collection 
of Gardner’s “Mathematical Games” columns 
from Scientific American that include many 
of the problems, puzzles and paradoxes 
that have earned him a reputation as a 
master mathematical magician. 


Included here are chapters on Conway’s 
surreal numbers, Mandelbrot’s fractals, and 
Smullyan’s logic puzzles, as well as puzzlers 
dealing with hyperbolas, negative numbers, 
pool-ball triangles, and Penrose tiles and 
trapdoor ciphers. And of course, you can 
read of the return of Dr. Irvine Joshua 
Matrix, (famed numerologist and CIA 
operative), one of Martin Gardner’s oldest 
fictional friends. 


Penrose Tiles to 
Trapdoor Ciphers 


... and the Return of Dr. Matrix 


MARTIN GARDNER 
A reissue of another Gardner classic 
Series: Spectrum 


Read what reviewers have said about Penrose Tiles to Trapdoor 
Ciphers ... 


The scope is extraordinary ... Those fortunate enough to have 
encountered Gardner's columns in their original appearance 
can look for personal bonuses of reminiscence as they read this 
book ... Gardner is one of history’s great figures of recreational 
mathematics. —New Scientist 


Penrose Tiles to Trapdoor Ciphers is invaluable to those interested 
in recreational mathematics and should enlighten those who 
consider such activity to be difficult or boring. 

—The Mathematics Teacher. 


No popular mathematical writer has ever matched Gardner's 
breadth and richness of knowledge and clarity of style, and this 
book is up to his usual unsurpassable standard. 

—American Scientist 
When You Order Through 
the AMS Bookstore! 


For a limited time, you can enjoy additional 
savings on these best-selling books and other 
selected titles when you order online via the 
AMS Bookstore. The bookstore now includes 
the entire backlist of AMS titles—over 2300 
books in print! Go to www.ams.org/bookstore 
and take advantage now of these Web-only 
A Primer of Mathematical 
Writing 

Steven G. Krantz, Washington University, 
St. Louis, MO 


This book is about writing in the professional 
mathematical environment. There are few peo- 
ple equal to this task, yet Steven Krantz is one 
who qualifies. While the book is nominally 
about writing, it’s also about how to function 
in the mathematical profession. In many ways, 
this text complements Krantz’s previous best- 
seller, How to Teach Mathematics. Those who are 
familiar with Krantz’s writing will recognize 
his lively, inimitable style. 


In this volume, he addresses these nuts-and- 
bolts issues: 

• Syntax, grammar, structure, and style 

• Mathematical exposition 

• Use of the computer and TeX 

• E-mail etiquette 

• All aspects of publishing a journal article 


Krantz’s frank and straightforward approach 
makes this particularly suitable as a textbook. 
He does not avoid difficult topics. His intent is 
to demonstrate to the reader how to success- 
fully operate within the profession. He out- 
lines how to write grant proposals that are 
persuasive and compelling, how to write a let- 
ter of recommendation describing the research 
abilities of a candidate for promotion or 
tenure, and what a dean is looking for in a let- 
ter of recommendation. He further addresses 
some basic issues such as writing a book pro- 
posal to a publisher or applying for a job. 
Readers will find in reading this text that 
Krantz has produced a quality work which 
makes evident the power and significance of 
writing in the mathematics profession. 

1997; 223 pages; Softcover; ISBN 0-8218-0635-1; 

List $19; All AMS members $15; Order code 
PMW\IM78 


savings (valid until December 1, 1997). 


Techniques of Problem Solving 


Steven G. Krantz, Washington University, 
St. Louis, MO 


... the subject of problem solving, as viewed by this 
author, is more than just a disconnected list of 
brain teasers and recreations. It is a way of life. 
Scientists of every stripe (chemists, physicists, 
psychologists, social engineers, and many others) 
ply their trade by considering a set of data, decid- 
ing what techniques are relevant to these data, and 
then solving a problem. It is this view of problem 
solving that will be promulgated in the present 
book. (from the Preface) 
The purpose of this book is to teach the basic 
principles of problem solving, including both 
mathematical and nonmathematical problems. 
This book will help students to ... 


• translate verbal discussions into analytical 
data. 

• learn problem-solving methods for attacking 
collections of analytical questions or data. 

• build a personal arsenal of internalized prob- 
lem-solving techniques and solutions. 

• become “armed problem solvers”, ready to 
do battle with a variety of puzzles in differ- 
ent areas of life. 


Taking a direct and practical approach to the 
subject matter, Krantz’s book stands apart 
from others like it in that it incorporates exer- 
cises throughout the text. After many solved 
problems are given, a “Challenge Problem” is 
presented. Additional problems are included 
for readers to tackle at the end of each chapter. 
There are more than 330 problems in all. A 
Solutions Manual to most end-of-chapter exer- 
cises is available. 

1997; 465 pages; Softcover; ISBN 0-8218-0619-X; 

List $29; All AMS members $23; Order code 
important problems of their day. The text is written in 
modern style and notation, with an emphasis on the peda- 
gogical factor throughout. It develops calculus from 
within the relevant historical context, and is suitable for a 
variety of audiences, from arts and letters to science students. 
Over 600 exercises let students sharpen their mathematical 
thinking and skills. 


Part 1: From Archimedes to Newton: The Greeks Measure 
DISCRETE PROBABILITY 


Discrete Probability is a textbook, at a post-calculus level, 
for a first course in probability. Since continuous probability 
is not treated, discrete probability can be covered in greater 
depth. The result is a book of special interest to students 
majoring in computer science as well as those majoring in 
mathematics. Since calculus is used only occasionally, stu- 
dents who have forgotten calculus can nevertheless easily 
understand the book. The slow, gentle style and clear expo- 
sition will appeal to students. Basic concepts such as count- 
ing, independence, conditional probability, random variables, 
approximation of probabilities, generating functions, ran- 
dom walks and Markov chains are presented with good expla- 
nation, many worked exercises, and an abundance of 
problems. Throughout the book, various comments on the 
history of the study of probability are inserted. Biographical 
information about some of the famous contributors to prob- 
ability such as Fermat, Pascal, the Bernoullis, DeMoivre, 
Bayes, Laplace, Poisson, Markov, and many others, is pre- 
sented. This volume will appeal to a wide range of readers 
and be useful in many undergraduate programs. 


1997 /APP. 256 PP./HARDCOVER/$39.95 ISBN 0-387-98227-2 
UNDERGRADUATE TEXTS IN MATHEMATICS 


Forthcoming — 


ELIAS DEEBA and ANANDA GUNAWARDENA, both of University of 
Houston-Downtown, TX 


INTERACTIVE LINEAR 
ALGEBRA WITH MAPLE V 


1997/APP. 288 PP./$45.95 (TENT.)/SOFTCOVER/ISBN 0-387-98240-X 
TEXTBOOKS IN MATHEMATICAL SCIENCES 


NANCY BAXTER HASTINGS, Dickinson College, Carlisle, PA 


WORKSHOP CALCULUS 


Guided Exploration with Review 
cepts with the study of concepts 
encountered in a traditional first 
semester calculus course: func- 
tions, limits, derivatives, inte- 
grals, and an introduction to 
integration techniques. This two- 
course sequence is designed for 
students who are not prepared to 
enter Calculus I, but who need 
to develop mathematical skills for further study in the social 
sciences, natural sciences, or mathematics. Essential elements 
of Workshop Calculus include the emphasis on applications 
to enhance student motivation and the use of computers and 
graphing calculators to help explore mathematical ideas. 
Areas and Intersections in Convex Domains 


Norbert Peyerimhoff 


1. INTRODUCTION. The following considerations were inspired by my browsing 
in the sections about “Bertrand’s Paradox” and “Buffon’s Needle Problem” of the 
nice introductory book The Pleasures of Probability [4]. Both sections discuss 
problems of an ancient field of mathematics called geometric probability. 

In some of the subsequent nights I couldn’t immediately fall asleep and started 
to play around with randomly chosen line segments in a bounded convex domain 
and the probability that they intersect. To avoid ambiguities I have to explain that 
a straight line segment (A, B) is to be chosen randomly by an independent and 
uniform random choice of its endpoints A and B in the convex domain. Of course, 
there are many other methods to choose line segments, which can yield different 
results for this probability. 

Generally, two line segments (A, B) and (C, D) intersect only if they are the 
diagonals of a convex quadrilateral spanned by the points A, B, C, D. If one of the 
points A, B, C, D lies properly inside the convex hull of the other three, the two 
line segments cannot intersect. This simple observation connects our problem with 
an old problem posed by Sylvester more than a century ago: Given a bounded 
convex domain $K \subset \mathbb{R}^2$, what is the probability $p_K$ that four points independently 
chosen in K span a convex quadrilateral? The connection between these apparently 
different problems was also important in the recent Monthly article [12], which 
proved that two fundamental constants of planar geometry and geometric probabil- 
ity are equal. The considerations of our article are also related to Sylvester’s 
problem. 

In Section 2 we derive a somewhat unexpected relationship between areas of 
particular subsets of an arbitrary bounded convex domain K by interpreting 
Sylvester’s probability $p_K$ in two different ways. It is a good example of creative 
interplay between geometry and probability. 

Section 3 deals with a higher-dimensional analogue of the original question: 
Assume a triangle and a line segment are chosen at random in $B^3$, the 3-dimensional 
unit ball. The triangle is determined by choosing three spanning vertices and the line 
segment is determined by choosing its end points. What is the probability that they 
intersect? The solution to this problem is based on two results. We use a theorem 
of Kingman, which treats a generalization of Sylvester’s problem in higher dimen- 
sions. We also use a simple case of a fundamental theorem in convex geometry 
called Radon’s theorem. It states that any set of n + 2 points in $\mathbb{R}^n$ can always be 
partitioned into two subsets $V_1, V_2$ such that the convex hulls of $V_1$ and $V_2$ 
intersect. 

Henceforth, $\operatorname{conv}(X_1, \ldots, X_k)$ denotes the convex hull of the points $X_1, \ldots, X_k \in \mathbb{R}^n$ 
and $\operatorname{aff}(X_1, \ldots, X_k)$ denotes the smallest affine subspace containing the 
points $X_1, \ldots, X_k$. We say the points $X_1, \ldots, X_k \in \mathbb{R}^n$ are in general position if 
any affine subspace $P \subset \mathbb{R}^n$ of dimension r < n contains at most r + 1 points. The 
n-dimensional unit ball is denoted by $B^n$. 
2. A STRANGE RELATIONSHIP BETWEEN AREAS AND THE PROBLEM OF 
SYLVESTER. For the following considerations we fix a convex bounded domain 
$K \subset \mathbb{R}^2$ and introduce two subsets $\triangle X_1X_2X_3$ and $\odot X_1X_2X_3$ associated with 
three points $X_1$, $X_2$, and $X_3$ of K (see the shaded domains in Figure 1). Let us 
compare the behavior of $\operatorname{area}(\triangle X_1X_2X_3)$ and $\operatorname{area}(\odot X_1X_2X_3)$ as the points 
$X_1, X_2, X_3$ move around. The shape of K influences $\operatorname{area}(\triangle X_1X_2X_3)$ only by 
forcing the points $X_1, X_2, X_3$ to stay inside K. On the other hand, $\operatorname{area}(\odot X_1X_2X_3)$ 
is very sensitive to the shape of K since $\odot X_1X_2X_3$ and K have common pieces of 
boundary. Moreover, moving the points $X_1, X_2, X_3$ closer to the boundary of K 
generally increases $\operatorname{area}(\triangle X_1X_2X_3)$ and decreases $\operatorname{area}(\odot X_1X_2X_3)$. 


Figure 1 


Although both areas behave very differently, there is a surprising connection 
between them after taking averages. Independent of the shape of K, the average 
area of $\odot X_1X_2X_3$ is exactly three times the average area of $\triangle X_1X_2X_3$, as 
described precisely in the following proposition. 


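Proposition 1. Let $K \subset \mathbb{R}^2$ be a given bounded convex domain. Then 

\[
\frac{1}{\operatorname{area}(K)^3} \int_{K \times K \times K} \operatorname{area}(\odot X_1X_2X_3)\, dX_1\, dX_2\, dX_3
= \frac{3}{\operatorname{area}(K)^3} \int_{K \times K \times K} \operatorname{area}(\triangle X_1X_2X_3)\, dX_1\, dX_2\, dX_3. \tag{1}
\]

The two normalized integrals in (1) are, respectively, what we mean by the average 
areas of $\odot X_1X_2X_3$ and $\triangle X_1X_2X_3$. 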


The main idea of the proof is to express a particular probability in two different 
ways. Let 

\[
p_\triangle = \text{probability that the convex hull of four independently chosen points } X_1, \ldots, X_4 \in K \text{ is a triangle.} \tag{2}
\]


Bertrand’s paradox tells us that one has to be careful in using probabilities for 
geometric settings. We avoid these difficulties by restricting ourselves only to 
independent and uniform choices of points in the convex bounded domain K. Any 
choice of 4 points corresponds to an element in the product space $K \times K \times K \times K$. 
By using Fubini’s theorem we can reduce the situation to the probability that a 
randomly chosen point lies inside some measurable subset $A \subset K$. This probability 
is given by area(A)/area(K), which corresponds to the quotient 

\[
\frac{\#\,\text{good cases}}{\#\,\text{all cases}}
\]

in the case of finite sets. This relationship between probabilities and areas is 
crucial in the proof. 


Proof: Let $X_1, \ldots, X_4$ be four independently chosen points in K. We can restrict 
our considerations to points in general position since the event “$X_1, X_2, X_3, X_4$ 
are not in general position” has probability zero. If $\operatorname{conv}(X_1, X_2, X_3, X_4)$ is a 
triangle, one of the four points must lie inside the convex hull of the other three. 
This interior point is uniquely determined. For $j = 1, \ldots, 4$, let $p_j$ denote the 
probability of the event “$\operatorname{conv}(X_1, X_2, X_3, X_4)$ is a triangle and $X_j$ is the interior 
point.” By symmetry, note that $p_1 = p_2 = p_3 = p_4$. Then the probability $p_\triangle$ defined 
in (2) satisfies $p_\triangle = p_1 + p_2 + p_3 + p_4 = 4p_4$. 

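For fixed $X_1, X_2, X_3 \in K$ the probability of the event “$X_4 \in \operatorname{conv}(X_1, X_2, X_3)$” 
($X_4$ chosen randomly in K) is given by $\operatorname{area}(\triangle X_1X_2X_3)/\operatorname{area}(K)$. Since 
$X_1, X_2, X_3 \in K$ are also chosen randomly we have to integrate over all possible 
triples $(X_1, X_2, X_3) \in K \times K \times K$: 

\[
p_4 = \frac{1}{\operatorname{area}(K)^3} \int_{K \times K \times K} \frac{\operatorname{area}(\triangle X_1X_2X_3)}{\operatorname{area}(K)}\, dX_1\, dX_2\, dX_3,
\]

and consequently, 

\[
p_\triangle = \frac{4}{\operatorname{area}(K)^3} \int_{K \times K \times K} \frac{\operatorname{area}(\triangle X_1X_2X_3)}{\operatorname{area}(K)}\, dX_1\, dX_2\, dX_3. \tag{3}
\]

In a second approach to $p_\triangle$, we again assume $X_1, X_2, X_3 \in K$ to be fixed and 
ask for the probability of “$\operatorname{conv}(X_1, X_2, X_3, X_4)$ is a triangle” if $X_4$ is chosen 
randomly in K. This happens exactly if $X_4$ lies inside the shaded domain of 
Figure 2, which is $\triangle X_1X_2X_3 \cup \odot X_1X_2X_3$. Thus this probability is given by 

\[
\frac{\operatorname{area}(\triangle X_1X_2X_3) + \operatorname{area}(\odot X_1X_2X_3)}{\operatorname{area}(K)}.
\]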

Figure 2 
We again obtain $p_\triangle$ by integrating this probability over all possible choices 
$(X_1, X_2, X_3) \in K \times K \times K$: 

\[
p_\triangle = \frac{1}{\operatorname{area}(K)^3} \int_{K \times K \times K} \frac{\operatorname{area}(\triangle X_1X_2X_3) + \operatorname{area}(\odot X_1X_2X_3)}{\operatorname{area}(K)}\, dX_1\, dX_2\, dX_3. \tag{4}
\]

Comparison of (3) and (4) yields the statement of the proposition. ∎ 
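
The first way of computing the triangle probability can be checked numerically. The following is only a minimal sketch, assuming K is the unit disk (so area(K) = π); the function names are illustrative and do not come from the article. It estimates the probability directly, by testing whether one of four random points lies inside the triangle of the other three, and again through identity (3), as four times the average triangle area divided by area(K).

```python
import math
import random


def random_point_in_disk(rng):
    """Uniform point in the unit disk by rejection sampling from the square."""
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            return (x, y)


def cross(o, a, b):
    """Twice the signed area of the triangle o, a, b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p, a, b, c):
    """True if p lies inside the triangle a, b, c (boundary cases have probability zero)."""
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    return (d1 > 0) == (d2 > 0) == (d3 > 0)


def estimate_p_triangle(samples=100_000, seed=0):
    """Return (direct estimate, estimate via identity (3)) for the unit disk."""
    rng = random.Random(seed)
    hull_is_triangle = 0
    area_sum = 0.0
    for _ in range(samples):
        pts = [random_point_in_disk(rng) for _ in range(4)]
        # Direct count: some point lies inside the triangle of the other three.
        for j in range(4):
            others = pts[:j] + pts[j + 1:]
            if in_triangle(pts[j], *others):
                hull_is_triangle += 1
                break
        # Ingredient for (3): area of the triangle X1 X2 X3.
        area_sum += abs(cross(pts[0], pts[1], pts[2])) / 2
    direct = hull_is_triangle / samples
    via_area = 4 * (area_sum / samples) / math.pi  # area(K) = pi for the unit disk
    return direct, via_area
```

Calling `estimate_p_triangle()` returns two numbers that should agree up to sampling error. Any other bounded convex domain could be used by swapping the sampler and the area constant.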


Dropping the condition $\operatorname{area}(K) < \infty$ (which is guaranteed by the boundedness 
of K) leads to serious difficulties. For example, the probability that the convex hull 
of four randomly and independently chosen points in the whole plane is a triangle 
is not well-defined. 

Obviously, $p_\triangle$ and Sylvester’s probability $p_K$ mentioned in the introduction are 
probabilities of complementary events, hence $p_\triangle = 1 - p_K$. Some further remarks 
about Sylvester’s probability are in order. Invariance of $p_K$ under affine transfor- 
mations of K implies that its value is the same for all triangles as well as for all 
ellipses. For these cases $p_K$ is known [7, pp. 44-46]: 

\[
p_K = \begin{cases}
\dfrac{2}{3} & \text{if } K \text{ is a triangle} \\[1ex]
1 - \dfrac{35}{12\pi^2} & \text{if } K \text{ is an ellipse.}
\end{cases}
\]


W. Blaschke [1, §24, §25] gave the first rigorous proof that the triangle and the 
ellipse are the extremal cases for $p_K$ and that $2/3 \le p_K \le 1 - 35/(12\pi^2)$ for any 
bounded convex domain K. Blaschke’s proof is surely very appealing to readers 
interested in geometric ideas. 

I would like to mention two generalizations of Sylvester’s problem. J. F. C. 
Kingman investigated the higher-dimensional analogue of Sylvester’s problem and 
solved it for the case $K = B^n$ (see [5, Theorem 7]). The extremality property of 
ellipses for $p_K$ was generalized to higher dimensions by H. Groemer [3]: 


Theorem 2. (Kingman, Groemer) The probability $p_{n+1,1}$ that the convex hull of n + 2 
randomly chosen points in the unit ball $B^n \subset \mathbb{R}^n$ has n + 1 vertices is given by 

\[
p_{n+1,1} = (n+2)\, \frac{\dbinom{n+1}{\frac{n+1}{2}}^{\,n+1}}{2^n \dbinom{(n+1)^2}{\frac{(n+1)^2}{2}}}. \tag{5}
\]

The corresponding probability for any bounded convex domain $K \subset \mathbb{R}^n$ is never smaller 
than the value given in (5), in agreement with the planar bound $p_K \le 1 - 35/(12\pi^2)$. 


The binomial coefficients in (5) are defined by 

\[
\binom{a}{b} = \frac{\Gamma(a+1)}{\Gamma(b+1)\,\Gamma(a-b+1)}
\]

if not both parameters a, b are integers. Since $\Gamma(1/2) = \sqrt{\pi}$ we obtain $p_{3,1} = 35/(12\pi^2)$. 
For n = 3 we obtain $p_{4,1} = 9/143$, which will be used in the next 
section. 
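
The two special values can be reproduced by evaluating (5) with the Gamma-function definition of the binomial coefficient. This is a small sketch using only the Python standard library; the function names are illustrative, not taken from the article.

```python
import math


def generalized_binomial(a, b):
    """Binomial coefficient via the Gamma function, as in the definition following (5)."""
    return math.exp(math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1))


def kingman_probability(n):
    """Right-hand side of (5) for the unit ball in dimension n."""
    top = generalized_binomial(n + 1, (n + 1) / 2) ** (n + 1)
    bottom = 2 ** n * generalized_binomial((n + 1) ** 2, (n + 1) ** 2 / 2)
    return (n + 2) * top / bottom


if __name__ == "__main__":
    # Compare with the closed forms quoted in the text for n = 2 and n = 3.
    print(kingman_probability(2), 35 / (12 * math.pi ** 2))
    print(kingman_probability(3), 9 / 143)
```

Working with `math.lgamma` rather than `math.gamma` is a design choice that avoids overflow for larger n; for the two cases used here either would do.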


Another generalization of Sylvester’s problem was investigated much more 
recently by P. Valtr: What is the probability $p_K(n)$ that n randomly chosen points in a 
fixed bounded convex domain $K \subset \mathbb{R}^2$ are the vertices of a convex n-gon? Sylvester’s 
problem is then the special case n = 4. Valtr obtained explicit formulas for this 
more general problem if K is a parallelogram or a triangle [13, 14]: 

\[
p_K(n) = \begin{cases}
\dfrac{2^n (3n-3)!}{((n-1)!)^3 (2n)!} & \text{if } K \text{ is a triangle} \\[2ex]
\left( \dfrac{\binom{2n-2}{n-1}}{n!} \right)^{2} & \text{if } K \text{ is a parallelogram.}
\end{cases}
\]

Interestingly, Valtr’s method is purely combinatorial and does not require any 
integration. I recommend [6], [11, pp. 63-65], and [15] for further information 
about this subject. 
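
As a quick consistency check, Valtr's two formulas can be evaluated exactly with rational arithmetic. The sketch below uses illustrative function names; it confirms that n = 4 gives Sylvester's value 2/3 for the triangle, and 25/36 for the parallelogram, the classical value for that case, which is not quoted in the text above.

```python
from fractions import Fraction
from math import comb, factorial


def valtr_triangle(n):
    """Probability that n random points in a triangle are in convex position."""
    return Fraction(2 ** n * factorial(3 * n - 3),
                    factorial(n - 1) ** 3 * factorial(2 * n))


def valtr_parallelogram(n):
    """Probability that n random points in a parallelogram are in convex position."""
    return Fraction(comb(2 * n - 2, n - 1), factorial(n)) ** 2


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, valtr_triangle(n), valtr_parallelogram(n))
```

For n = 3 both formulas return 1, as they must, since three points in general position always span a triangle.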


3. ABOUT THE PROBABILITY OF AN INTERSECTION. This section gives an 
answer to the following question: Let us assume that five points $X_1, X_2, X_3, X_4, X_5$ 
are chosen independently and uniformly at random in the unit ball $B^3$. The convex hull 
of $X_1, X_2, X_3$ is generally a triangle $\triangle X_1X_2X_3$ and the convex hull of $X_4, X_5$ is 
generally a straight line segment $(X_4, X_5)$. What is the probability of the event 
“$\triangle X_1X_2X_3 \cap (X_4, X_5) \neq \emptyset$”? 

The following simple argument shows that this probability cannot exceed 
1/2. Let $X_1, X_2, X_3 \in B^3$ be randomly chosen points in general position. Since 
$\triangle X_1X_2X_3 \subset \operatorname{aff}(X_1, X_2, X_3)$ the probability of “$\triangle X_1X_2X_3 \cap (X_4, X_5) \neq \emptyset$” is at 
most the probability of the event “$\operatorname{aff}(X_1, X_2, X_3) \cap (X_4, X_5) \neq \emptyset$.” The plane 
$\operatorname{aff}(X_1, X_2, X_3)$ cuts the ball $B^3$ into two disjoint subsets $S_1, S_2$. Let p and q 
denote the probabilities that a randomly chosen point $Y \in B^3$ lies in $S_1$ or $S_2$, 
respectively. Assuming the plane $\operatorname{aff}(X_1, X_2, X_3)$ to be chosen and fixed, 2pq is 
the probability that the line segment $(X_4, X_5)$ of two randomly chosen points 
$X_4, X_5 \in B^3$ intersects this plane (this is the event “$X_4$ and $X_5$ fall into different 
subsets $S_1$ and $S_2$”). From p + q = 1 we conclude that $2pq = 2p(1-p) \le 1/2$, with 
equality only for p = 1/2. By averaging 
over all choices of $X_1, X_2, X_3$ we obtain a final probability $\le 1/2$, but we shall 
see that this upper bound is very crude. 

To determine the exact probability, we first investigate the possible shapes of 
the convex hull of five randomly chosen points $X_1, X_2, X_3, X_4, X_5 \in \mathbb{R}^3$ in general 
position. It turns out that $\operatorname{conv}(X_1, \ldots, X_5)$ is either a double pyramid over a 
triangle with tips in different directions, or it is a tetrahedron. This statement 
seems obvious but the author doesn’t know a three-line proof. However, it is an 
easy consequence of the following basic theorem in convex geometry: 


Theorem 3. (Radon). If $V = \{X_1, X_2, \ldots, X_{n+2}\}$ is a set of n + 2 points of $\mathbb{R}^n$ in 
general position then there exists a partition of these points into two sets $V_1, V_2$ with 
$V_1 \cup V_2 = \{X_1, \ldots, X_{n+2}\}$ and $V_1 \cap V_2 = \emptyset$ such that the convex hulls $\operatorname{conv}(V_1)$ and 
$\operatorname{conv}(V_2)$ intersect in exactly one point. This “Radon partition” $V_1, V_2$ is unique in the 
sense that, for any other partition, the intersection of the corresponding convex hulls is 
empty. Here we identify partitions that are obtained by interchanging $V_1$ and $V_2$. 


Radon’s theorem can be proved very elegantly with methods of linear algebra 
(see, e.g., [8, pp. 22-24] or exercise 6.0 in [16]). The algebraic methods, however, 
don’t illustrate the geometric intuition of the theorem. Readers interested in a 
geometric proof should look into [10] and [9]. A thorough treatment of the 
historical background of the theorem and related results can be found in [2]. 
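
The linear-algebra argument can be made concrete. The sketch below is a standard construction, not necessarily the proof given in the cited references: it finds coefficients $a_1, \ldots, a_{n+2}$, not all zero, with $\sum a_i X_i = 0$ and $\sum a_i = 0$, and splits the points by the sign of their coefficient. The function name `radon_partition` is illustrative, and general position is assumed so that the elimination never meets a zero pivot column.

```python
from fractions import Fraction


def radon_partition(points):
    """Index sets (positive, negative) of an affine dependence of n + 2 points in R^n."""
    m = len(points)        # m = n + 2 points
    n = len(points[0])     # ambient dimension
    assert m == n + 2
    # One row per coordinate plus a row of ones; one column per point.
    rows = [[Fraction(p[c]) for p in points] for c in range(n)]
    rows.append([Fraction(1)] * m)
    # Gauss-Jordan elimination on the first n + 1 columns.
    for col in range(n + 1):
        pivot = next(r for r in range(col, n + 1) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n + 1):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    # Null vector: fix the last coefficient to 1 and read off the others.
    coeffs = [-rows[r][m - 1] for r in range(n + 1)] + [Fraction(1)]
    positive = [i for i, a in enumerate(coeffs) if a > 0]
    negative = [i for i, a in enumerate(coeffs) if a < 0]
    return positive, negative


if __name__ == "__main__":
    # A triangle in the plane z = 0 pierced by a vertical segment.
    five = [(1, 0, 0), (0, 1, 0), (-1, -1, 0), (0, 0, 1), (0, 0, -1)]
    print(radon_partition(five))
```

For the five points in the usage example the partition separates the triangle from the segment, which is the 3 + 2 situation relevant to this section. For four points in the plane with one point inside the triangle of the other three, the partition is 1 + 3 instead.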
Figure 3 


Corollary 4. Assume that there are five points $X_1, X_2, \ldots, X_5 \in \mathbb{R}^3$ in general 
position whose convex hull has five vertices. Then one can always choose a partition 
$V_1 = \{X_i, X_j, X_k\}$ and $V_2 = \{X_l, X_m\}$ such that 

\[
\triangle X_iX_jX_k \cap (X_l, X_m) \neq \emptyset.
\]

This partition is unique in the sense of Theorem 3. 


Proof: Since the convex hull P has five vertices, none of the points $X_1, \ldots, X_5$ lies
inside the convex hull of the other four. Hence the unique partition $V_1, V_2$ in
Theorem 3 must consist of 3 and 2 points. ∎


Figure 3 illustrates a case in which the convex hull of $X_1, \ldots, X_5$ has five
vertices. To confirm the uniqueness of the partition, observe that the triangle and
the segment defined by the partition $V_1' = \{X_l, X_j, X_m\}$, $V_2' = \{X_i, X_k\}$ don’t
intersect.

If $X_1, \ldots, X_5$ are in general position and $P := \operatorname{conv}\{X_1, \ldots, X_5\}$ is a
tetrahedron, the unique Radon partition is

$V_1 := \{\text{the four vertices of } P\}$,
$V_2 := \{\text{remaining fifth point in the interior of } P\}$,

and any other partition yields an empty intersection. Consequently, the
intersection of a triangle spanned by any three of the points $X_1, \ldots, X_5$ with the line
segment spanned by the remaining two points is always empty.

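With this information in hand we are able to calculate the probability of the
event “$\triangle X_1X_2X_3 \cap (X_4, X_5) \neq \emptyset$” for randomly chosen points $X_1, \ldots, X_5 \in B^3$
as

$$p_{32} = \frac{1}{\operatorname{vol}(B^3)^5} \int_{B^3} \cdots \int_{B^3} m_{32}(X_1, \ldots, X_5)\, dX_1 \cdots dX_5, \qquad (7)$$

where the function in the integrand is

$$m_{32}(X_1, X_2, X_3, X_4, X_5) = \begin{cases} 1 & \text{if } \triangle X_1X_2X_3 \cap (X_4, X_5) \neq \emptyset, \\ 0 & \text{otherwise.} \end{cases}$$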
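Let us denote by $\mathcal{S}_5$ the group of permutations of the numbers 1, 2, 3, 4, 5. For
any five points $X_1, \ldots, X_5$ in general position, we shall show later that Corollary 4
ensures that

$$\sum_{\sigma \in \mathcal{S}_5} m_{32}(X_{\sigma_1}, X_{\sigma_2}, X_{\sigma_3}, X_{\sigma_4}, X_{\sigma_5}) = \begin{cases} 12 & \text{if } \operatorname{conv}(X_1, \ldots, X_5) \text{ has 5 vertices,} \\ 0 & \text{otherwise.} \end{cases} \qquad (8)$$

In the right-hand side of (7) we can permute the variables $X_1, X_2, \ldots, X_5$ without
changing its value. If we rewrite (7) in the following more complicated way

$$p_{32} = \frac{1}{\operatorname{vol}(B^3)^5} \, \frac{1}{5!} \int_{B^3} \cdots \int_{B^3} \sum_{\sigma \in \mathcal{S}_5} m_{32}(X_{\sigma_1}, \ldots, X_{\sigma_5})\, dX_1 \cdots dX_5,$$

we are able to apply (8) and obtain

$$p_{32} = \frac{12}{5!} \, \frac{1}{\operatorname{vol}(B^3)^5} \int_{B^3} \cdots \int_{B^3} k_5(X_1, \ldots, X_5)\, dX_1 \cdots dX_5, \qquad (9)$$

where $k_5$ is 1, if $\operatorname{conv}(X_1, \ldots, X_5)$ has 5 vertices, and 0, otherwise. The normalized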


integral in (9) is the probability that the convex hull of 5 randomly chosen points in 
B° has 5 vertices, which is 1 — p, 3, and Theorem 2 ensures that p, , = 9/143. 
Consequently, the answer to the question posed at the beginning of this section is: 


Proposition 5. Assume that $X_1, \ldots, X_5$ are chosen randomly in the unit ball $B^3 \subset \mathbb{R}^3$.
The probability of the event “$\triangle X_1X_2X_3 \cap (X_4, X_5) \neq \emptyset$” is

$$p_{32} = \frac{1}{10} \cdot \frac{134}{143} \approx 0.0937\ldots$$


The corresponding probability for any bounded convex set K € R° is no less than py. 


It remains to prove (8): If $X_1, \ldots, X_5$ are in general position, $\operatorname{conv}(X_1, \ldots, X_5)$
has either 5 or 4 vertices. In the latter case, $m_{32}(X_{\sigma_1}, \ldots, X_{\sigma_5}) = 0$ for any
permutation $\sigma \in \mathcal{S}_5$. If $\operatorname{conv}(X_1, \ldots, X_5)$ has 5 vertices we conclude from
Corollary 4 that

$$m_{32}(X_{\sigma_1}, \ldots, X_{\sigma_5}) = 1$$

if and only if $\{\sigma_1, \sigma_2, \sigma_3\} = \{i, j, k\}$ and $\{\sigma_4, \sigma_5\} = \{l, m\}$. This is true for exactly
$3!\,2! = 12$ of all the permutations. ∎
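
The counting argument and the value in Proposition 5 can be checked numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, points are sampled uniformly in the unit ball by rejection, and the triangle/segment test uses the standard orientation (signed volume) criterion, which is reliable only for points in general position. For each sample of five points it counts the permutations for which the first three points span a triangle met by the segment through the last two; the count should be 12 or 0, and averaging the counts divided by 5! estimates the probability.

```python
import itertools
import random


def random_point_in_ball(rng):
    """Uniform point in the unit ball of R^3 (rejection from the cube)."""
    while True:
        p = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        if p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= 1.0:
            return p


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def orient(a, b, c, d):
    """Sign (+1, 0, -1) of the signed volume of the tetrahedron a, b, c, d."""
    v = det3(sub(b, a), sub(c, a), sub(d, a))
    return (v > 0) - (v < 0)


def triangle_meets_segment(x1, x2, x3, x4, x5):
    """True if triangle x1 x2 x3 meets segment x4 x5 (general position assumed)."""
    # The endpoints must lie on opposite sides of the plane of the triangle.
    if orient(x1, x2, x3, x4) == orient(x1, x2, x3, x5):
        return False
    # The line through x4, x5 passes inside the triangle exactly when it
    # winds the same way around all three edges.
    s1 = orient(x4, x5, x1, x2)
    s2 = orient(x4, x5, x2, x3)
    s3 = orient(x4, x5, x3, x1)
    return s1 == s2 == s3


def count_hits(points):
    """Number of permutations sigma of the five points with m_32 = 1."""
    return sum(triangle_meets_segment(*perm)
               for perm in itertools.permutations(points))


def estimate_p32(samples, seed=0):
    """Monte Carlo estimate of p_32 using the symmetrized integrand."""
    rng = random.Random(seed)
    counts = [count_hits([random_point_in_ball(rng) for _ in range(5)])
              for _ in range(samples)]
    return sum(counts) / (120 * samples), counts
```

Because the symmetrized integrand takes only the values 12/120 and 0, the estimate is one tenth of the observed fraction of samples whose convex hull has five vertices, which is the content of the symmetrization step in the text.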


4. FINAL REMARKS. Proposition 1 can be generalized to higher dimensions. For
example, in 3 dimensions any four points $X_1, \ldots, X_4 \in K$ in general position span
a simplex S. The planes containing the two dimensional faces of S cut the domain
K into 15 subsets. In this case there exists a connection between the area of the
simplex spanned by $X_1, \ldots, X_4$ and the total area of the four subsets that touch
the simplex in exactly one vertex.

I would like to mention two problems related to the considerations of Section 3.
The first problem is an easy exercise whereas the second might be a difficult
research problem. Let $p_{kl}$ denote the probability that $\operatorname{conv}(X_1, \ldots, X_k)$ and
$\operatorname{conv}(Y_1, \ldots, Y_l)$ intersect where $X_1, \ldots, X_k$ and $Y_1, \ldots, Y_l$ are randomly chosen
points in the unit ball $B^{k+l-2}$.


Problem 1. Determine the probability $p_{22}$ that two randomly chosen segments in
$B^2$ intersect. Check your result by running a Monte Carlo test.
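
One possible shape for the Monte Carlo test mentioned in Problem 1 is sketched below. It is not part of the original article: the function names are hypothetical, the four endpoints are assumed to be independent and uniform in the unit disk, and the intersection test is the usual cross-product orientation test for points in general position. The sketch only produces an estimate to compare against a hand computation.

```python
import random


def random_point_in_disk(rng):
    """Uniform point in the unit disk (rejection from the square)."""
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1.0:
            return (x, y)


def cross(o, a, b):
    """Twice the signed area of the triangle o, a, b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def segments_intersect(p1, p2, q1, q2):
    """True if the open segments p1 p2 and q1 q2 cross (general position)."""
    # q1, q2 on opposite sides of line p1 p2, and p1, p2 on opposite
    # sides of line q1 q2.
    return (cross(p1, p2, q1) * cross(p1, p2, q2) < 0
            and cross(q1, q2, p1) * cross(q1, q2, p2) < 0)


def estimate_p22(samples, seed=0):
    """Fraction of random segment pairs in the unit disk that intersect."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        pts = [random_point_in_disk(rng) for _ in range(4)]
        hits += segments_intersect(*pts)
    return hits / samples
```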
Problem 2. It is possible to calculate the probabilities $p_{k1} = p_{1k}$ for all $k \ge 2$, $p_{22}$
and $p_{32} = p_{23}$ by combining Kingman’s and Radon’s theorem. This simple method
fails to work for all the other probabilities $p_{kl}$. Find new methods to calculate
some of the other probabilities.


Last, but not least, I want to express my gratitude to Edson de Faria, Reinhard 
Hermann, Richard Isaac, Lucian Man, Mechthild Stoer, Pavel Valtr, and Gunther 
M. Ziegler for many constructive comments. Luiz Magalhaes made helpful sugges- 
tions for improving the figures in this article. I am also thankful to the Deutsche 
Forschungsgemeinschaft (DFG) for its financial support. 
Newman’s Short Proof of the Prime
Number Theorem


D. Zagier 


Dedicated to the Prime Number Theorem on the occasion of its 100th birthday 


The prime number theorem, that the number of primes $\le x$ is asymptotic to
$x/\log x$, was proved (independently) by Hadamard and de la Vallée Poussin in
1896. Their proof had two elements: showing that Riemann’s zeta function $\zeta(s)$
has no zeros with $\Re(s) = 1$, and deducing the prime number theorem from this.
An ingenious short proof of the first assertion was found soon afterwards by the
same authors and by Mertens and is reproduced here, but the deduction of the
prime number theorem continued to involve difficult analysis. A proof that was
elementary in a technical sense (it avoided the use of complex analysis) was
found in 1949 by Selberg and Erdős, but this proof is very intricate and much less
clearly motivated than the analytic one. A few years ago, however, D. J. Newman
found a very simple version of the Tauberian argument needed for an analytic
proof of the prime number theorem. We describe the resulting proof, which has a
beautifully simple structure and uses hardly anything beyond Cauchy’s theorem.

Recall that the notation $f(x) \sim g(x)$ (“f and g are asymptotically equal”)
means that $\lim_{x \to \infty} f(x)/g(x) = 1$, and that $O(f)$ denotes a quantity bounded in
absolute value by a fixed multiple of f. We denote by $\pi(x)$ the number of primes
$\le x$.


Prime Number Theorem. $\pi(x) \sim \dfrac{x}{\log x}$ as $x \to \infty$.


We present the argument in a series of steps. Specifically, we prove a sequence
of properties of the three functions

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}, \qquad \Phi(s) = \sum_{p} \frac{\log p}{p^s}, \qquad \vartheta(x) = \sum_{p \le x} \log p \qquad (s \in \mathbb{C},\ x \in \mathbb{R});$$

we always use p to denote a prime. The series defining $\zeta(s)$ (the Riemann
zeta-function) and $\Phi(s)$ are easily seen to be absolutely and locally uniformly
convergent for $\Re(s) > 1$; so they define holomorphic functions in that domain.


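(I). $\zeta(s) = \prod_p (1 - p^{-s})^{-1}$ for $\Re(s) > 1$.

Proof: From unique factorization and the absolute convergence of $\zeta(s)$ we have

$$\zeta(s) = \sum_{r_2, r_3, \ldots \ge 0} (2^{r_2} 3^{r_3} \cdots)^{-s} = \prod_p \Big( \sum_{r \ge 0} p^{-rs} \Big) = \prod_p \frac{1}{1 - p^{-s}} \qquad (\Re(s) > 1).$$

(II). $\zeta(s) - \dfrac{1}{s-1}$ extends holomorphically to $\Re(s) > 0$.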
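Proof: For $\Re(s) > 1$ we have

$$\zeta(s) - \frac{1}{s-1} = \sum_{n=1}^{\infty} \frac{1}{n^s} - \int_1^{\infty} \frac{dx}{x^s} = \sum_{n=1}^{\infty} \int_n^{n+1} \Big( \frac{1}{n^s} - \frac{1}{x^s} \Big)\, dx.$$

The series on the right converges absolutely for $\Re(s) > 0$ because

$$\Big| \int_n^{n+1} \Big( \frac{1}{n^s} - \frac{1}{x^s} \Big)\, dx \Big| = \Big| s \int_n^{n+1} \int_n^{x} \frac{du}{u^{s+1}}\, dx \Big| \le \max_{n \le u \le n+1} \Big| \frac{s}{u^{s+1}} \Big| = \frac{|s|}{n^{\Re(s)+1}}$$

by the mean value theorem.

(III). $\vartheta(x) = O(x)$.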


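Proof: For $n \in \mathbb{N}$ we have

$$2^{2n} = (1 + 1)^{2n} = \binom{2n}{0} + \cdots + \binom{2n}{2n} \ge \binom{2n}{n} \ge \prod_{n < p \le 2n} p = e^{\vartheta(2n) - \vartheta(n)}$$

and hence, since $\vartheta(x)$ changes by $O(\log x)$ if x changes by $O(1)$, $\vartheta(x) - \vartheta(x/2) \le Cx$
for any $C > \log 2$ and all $x \ge x_0 = x_0(C)$. Summing this over
$x, x/2, \ldots, x/2^r$, where $x/2^r \ge x_0 > x/2^{r+1}$, we obtain $\vartheta(x) \le 2Cx + O(1)$.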
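
The chain of inequalities in this proof is easy to test for small n. The sketch below is an illustration only (the helper names are hypothetical and not from the paper): it checks that the product of the primes in $(n, 2n]$ divides the central binomial coefficient, that this coefficient is at most $4^n$, and hence that $\vartheta(2n) - \vartheta(n) \le 2n \log 2$.

```python
from math import comb, log


def primes_up_to(n):
    """List of primes <= n via the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting at p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]


def theta(x, primes):
    """Chebyshev's function: the sum of log p over primes p <= x."""
    return sum(log(p) for p in primes if p <= x)


def chebyshev_step_holds(n, primes):
    """Check prod_{n < p <= 2n} p divides C(2n, n) and C(2n, n) <= 4^n."""
    product = 1
    for p in primes:
        if n < p <= 2 * n:
            product *= p
    central = comb(2 * n, n)
    return central % product == 0 and product <= central <= 4 ** n
```

A call such as `all(chebyshev_step_holds(n, primes_up_to(400)) for n in range(1, 201))` exercises the inequality over a small range; it illustrates the argument and does not replace it.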


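(IV). $\zeta(s) \ne 0$ and $\Phi(s) - 1/(s - 1)$ is holomorphic for $\Re(s) \ge 1$.

Proof: For $\Re(s) > 1$, the convergent product in (I) implies that $\zeta(s) \ne 0$ and that

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_p \frac{\log p}{p^s - 1} = \Phi(s) + \sum_p \frac{\log p}{p^s (p^s - 1)}.$$

The final sum converges for $\Re(s) > \tfrac12$, so this and (II) imply that $\Phi(s)$ extends
meromorphically to $\Re(s) > \tfrac12$, with poles only at $s = 1$ and at the zeros of $\zeta(s)$,
and that, if $\zeta(s)$ has a zero of order $\mu$ at $s = 1 + i\alpha$ ($\alpha \in \mathbb{R}$, $\alpha \ne 0$) and a zero of
order $\nu$ at $1 + 2i\alpha$ (so $\mu, \nu \ge 0$ by (II)), then

$$\lim_{\varepsilon \searrow 0} \varepsilon\, \Phi(1 + \varepsilon) = 1, \qquad \lim_{\varepsilon \searrow 0} \varepsilon\, \Phi(1 + \varepsilon \pm i\alpha) = -\mu, \qquad \lim_{\varepsilon \searrow 0} \varepsilon\, \Phi(1 + \varepsilon \pm 2i\alpha) = -\nu.$$

The inequality

$$\sum_{r=-2}^{2} \binom{4}{2+r} \Phi(1 + \varepsilon + ir\alpha) = \sum_p \frac{\log p}{p^{1+\varepsilon}} \big( p^{i\alpha/2} + p^{-i\alpha/2} \big)^4 \ge 0$$

then implies that $6 - 8\mu - 2\nu \ge 0$, so $\mu = 0$, i.e., $\zeta(1 + i\alpha) \ne 0$.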
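
For readers who want the last step spelled out, here is one way to obtain the number $6 - 8\mu - 2\nu$ (a worked expansion added for illustration; it uses only the three limits and the inequality stated above). Multiply the inequality by $\varepsilon > 0$ and let $\varepsilon \searrow 0$, using the binomial coefficients 1, 4, 6, 4, 1:

```latex
\begin{align*}
0 &\le \lim_{\varepsilon \searrow 0} \sum_{r=-2}^{2} \binom{4}{2+r}\,
      \varepsilon\,\Phi(1+\varepsilon+ir\alpha) \\
  &= \binom{4}{0}(-\nu) + \binom{4}{1}(-\mu) + \binom{4}{2}\cdot 1
     + \binom{4}{3}(-\mu) + \binom{4}{4}(-\nu) \\
  &= 6 - 8\mu - 2\nu .
\end{align*}
```

Since $\mu$ and $\nu$ are non-negative integers, $8\mu \le 6 - 2\nu \le 6$ forces $\mu = 0$.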


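(V). $\displaystyle\int_1^{\infty} \frac{\vartheta(x) - x}{x^2}\, dx$ is a convergent integral.

Proof: For $\Re(s) > 1$ we have

$$\Phi(s) = \sum_p \frac{\log p}{p^s} = \int_1^{\infty} \frac{d\vartheta(x)}{x^s} = s \int_1^{\infty} \frac{\vartheta(x)}{x^{s+1}}\, dx = s \int_0^{\infty} e^{-st} \vartheta(e^t)\, dt.$$

Therefore (V) is obtained by applying the following theorem to the two functions
$f(t) = \vartheta(e^t) e^{-t} - 1$ and $g(z) = \Phi(z + 1)/(z + 1) - 1/z$, which satisfy its
hypotheses by (III) and (IV).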




Analytic Theorem. Let $f(t)$ ($t \ge 0$) be a bounded and locally integrable function and
suppose that the function $g(z) = \int_0^{\infty} f(t) e^{-zt}\, dt$ ($\Re(z) > 0$) extends holomorphically
to $\Re(z) \ge 0$. Then $\int_0^{\infty} f(t)\, dt$ exists (and equals $g(0)$).


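(VI). $\vartheta(x) \sim x$.

Proof: Assume that for some $\lambda > 1$ there are arbitrarily large x with $\vartheta(x) \ge \lambda x$.
Since $\vartheta$ is non-decreasing, we have

$$\int_x^{\lambda x} \frac{\vartheta(t) - t}{t^2}\, dt \ge \int_x^{\lambda x} \frac{\lambda x - t}{t^2}\, dt = \int_1^{\lambda} \frac{\lambda - t}{t^2}\, dt > 0$$

for such x, contradicting (V). Similarly, the inequality $\vartheta(x) \le \lambda x$ with $\lambda < 1$
would imply

$$\int_{\lambda x}^{x} \frac{\vartheta(t) - t}{t^2}\, dt \le \int_{\lambda x}^{x} \frac{\lambda x - t}{t^2}\, dt = \int_{\lambda}^{1} \frac{\lambda - t}{t^2}\, dt < 0,$$

again a contradiction for $\lambda$ fixed and x big enough. ∎

The prime number theorem follows easily from (VI), since for any $\varepsilon > 0$

$$\vartheta(x) = \sum_{p \le x} \log p \le \sum_{p \le x} \log x = \pi(x) \log x,$$

$$\vartheta(x) \ge \sum_{x^{1-\varepsilon} \le p \le x} \log p \ge \sum_{x^{1-\varepsilon} \le p \le x} (1 - \varepsilon) \log x = (1 - \varepsilon) \log x\, \big[ \pi(x) + O(x^{1-\varepsilon}) \big].$$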
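
The two asymptotic statements can be watched numerically. The sketch below is illustrative only (the function name is hypothetical); it tabulates $\pi(x)$ and $\vartheta(x)$ with a simple sieve so that the ratios $\vartheta(x)/x$ and $\pi(x)\log x / x$ can be inspected. Convergence is slow, so the ratios are still visibly away from 1 in this range.

```python
from math import log


def pi_and_theta(xs):
    """Return {x: (pi(x), theta(x))} for each integer x >= 2 in xs."""
    limit = max(xs)
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    targets = set(xs)
    results = {}
    count, theta = 0, 0.0
    for n in range(2, limit + 1):
        if sieve[n]:
            count += 1          # one more prime <= n
            theta += log(n)     # add log p to Chebyshev's function
        if n in targets:
            results[n] = (count, theta)
    return results


if __name__ == "__main__":
    for x, (pi_x, theta_x) in sorted(pi_and_theta([10**3, 10**4, 10**5]).items()):
        print(x, pi_x, theta_x / x, pi_x * log(x) / x)
```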


Proof of the Analytic Theorem. For $T > 0$ set $g_T(z) = \int_0^T f(t) e^{-zt}\, dt$. This is
clearly holomorphic for all z. We must show that $\lim_{T \to \infty} g_T(0) = g(0)$.

Let R be large and let C be the boundary of the region $\{z \in \mathbb{C} \mid |z| \le R,\ \Re(z) \ge -\delta\}$,
where $\delta > 0$ is small enough (depending on R) so that $g(z)$ is
holomorphic in and on C. Then

$$g(0) - g_T(0) = \frac{1}{2\pi i} \int_C \big( g(z) - g_T(z) \big)\, e^{zT} \Big( 1 + \frac{z^2}{R^2} \Big) \frac{dz}{z}$$

by Cauchy’s theorem. On the semicircle $C_+ = C \cap \{\Re(z) > 0\}$ the integrand is
bounded by $2B/R^2$, where $B = \max_{t \ge 0} |f(t)|$, because

$$|g(z) - g_T(z)| = \Big| \int_T^{\infty} f(t) e^{-zt}\, dt \Big| \le B \int_T^{\infty} |e^{-zt}|\, dt = \frac{B e^{-\Re(z) T}}{\Re(z)} \qquad (\Re(z) > 0)$$

and

$$\Big| e^{zT} \Big( 1 + \frac{z^2}{R^2} \Big) \Big| = e^{\Re(z) T} \cdot \frac{2 \Re(z)}{R}.$$

Hence the contribution to $g(0) - g_T(0)$ from the integral over $C_+$ is bounded in
absolute value by $B/R$. For the integral over $C_- = C \cap \{\Re(z) < 0\}$ we look at
$g(z)$ and $g_T(z)$ separately. Since $g_T$ is entire, the path of integration for the
integral involving $g_T$ can be replaced by the semicircle $C_-' = \{z \in \mathbb{C} \mid |z| = R,\ \Re(z) < 0\}$,
and the integral over $C_-'$ is then bounded in absolute value by $2\pi B/R$
by exactly the same estimate as before since

$$|g_T(z)| = \Big| \int_0^T f(t) e^{-zt}\, dt \Big| \le B \int_{-\infty}^{T} |e^{-zt}|\, dt = \frac{B e^{-\Re(z) T}}{|\Re(z)|} \qquad (\Re(z) < 0).$$

Finally, the remaining integral over $C_-$ tends to 0 as $T \to \infty$ because the integrand
is the product of the function $g(z)(1 + z^2/R^2)/z$, which is independent of T, and
the function $e^{zT}$, which goes to 0 rapidly and uniformly on compact sets as $T \to \infty$
in the half-plane $\Re(z) < 0$. Hence $\limsup_{T \to \infty} |g(0) - g_T(0)| \le 2B/R$. Since R is
arbitrary this proves the theorem. ∎


Historical remarks. The “Riemann” zeta function $\zeta(s)$ was first introduced and
studied by Euler, and the product representation given in (I) is his. The connection
with the prime number theorem was found by Riemann, who made a deep study of
the analytic properties of $\zeta(s)$. However, for our purposes the nearly trivial
analytic continuation property (II) is sufficient. The extremely ingenious proof in
(III) is in essence due to Chebyshev, who used more refined versions of such
arguments to prove that the ratio of $\vartheta(x)$ to x (and hence also of $\pi(x)$ to $x/\log x$)
lies between 0.92 and 1.11 for x sufficiently large. This remained the best result
until the prime number theorem was proved in 1896 by de la Vallée Poussin and
Hadamard. Their proofs were long and intricate. (A simplified modern presentation
is given on pages 41-47 of Titchmarsh’s book on the Riemann zeta function
[T].) The very simple proof reproduced in (IV) of the non-vanishing of $\zeta(s)$ on the
line $\Re(s) = 1$ was given in essence by Hadamard (the proof of this fact in de la
Vallée Poussin’s first paper had been about 25 pages long) and then refined by de
la Vallée Poussin and by Mertens, the version given by the former being particularly
elegant. The Analytic Theorem and its use to prove the prime number
theorem as explained in steps (V) and (VI) above are due to D. J. Newman. Apart
from a few minor simplifications, the exposition here follows that in Newman’s
original paper [N] and in the expository paper [K] by J. Korevaar.

We refer the reader to P. Bateman and H. Diamond’s survey article [B] for a 
beautiful historical perspective on the prime number theorem. 
An Exploratory Approach to Kaplansky’s
Lemma Leads to a Generalized Resultant


David Callan 


Let $T: V \to V$ be a linear transformation on a vector space V. If the powers
$I, T, T^2, \ldots, T^k$ of T are linearly dependent, then surely for every $v \in V$, the
vectors $v, Tv, T^2v, \ldots, T^kv$ are also linearly dependent. Kaplansky noted that the
converse assertion is also true.


Theorem 1. (Kaplansky’s Lemma in finite dimensions). Suppose $T: V \to V$ is a
linear transformation on a finite-dimensional vector space V over an arbitrary field F
and suppose the set of powers $\{T^i\}_{i=0}^{k}$ is linearly independent. Then there exists $v \in V$
such that the set $\{T^iv\}_{i=0}^{k}$ is linearly independent.


This result actually holds in arbitrary dimensions and the extension from finite 
to infinite dimensions is fairly straightforward [6, p. 64]. Aupetit [1] gives an 
elegant proof when F = C (valid for any dimensions) using Liouville’s theorem 
from complex analysis. His proof is reproduced, slightly polished, in Prasolov’s 
recent intriguing survey of linear algebra emphasizing modern proofs of classical 
results [5, p. 46]. 

Henceforth, we assume a finite-dimensional space, so T has a minimal polynomial
of degree n, say. The maximum k can be is $n - 1$ and so we may as well
assume $k = n - 1$. Prasolov then gives a slick proof, but only for infinite fields,
using minimal vector-annihilating polynomials and their subspaces [5, 13.1.2, p. 73].
It relies on the result: a vector space cannot be expressed as a finite set-theoretic
union of proper subspaces, and this of course requires the underlying field to be
infinite. For arbitrary fields, one can readily construct a suitable vector v using the
theory of the rational canonical form [4, p. 198]. This says that T acts as a block
diagonal matrix, the diagonal blocks being the companion matrices of the invariant
factors $d_1(x) \mid d_2(x) \mid \cdots \mid d_r(x)$ of T. The vector v consisting of 0’s except for a 1 in
the first position corresponding to the $d_r(x)$ block has minimal annihilating
polynomial $d_r(x)$. This vector v fits the bill since $d_r(x)$ is also the minimal
polynomial of T.

But suppose out of curiosity we look for a suitable v using the version of the
rational canonical form involving elementary divisors [3, p. 262]. Now T acts as a
block diagonal matrix whose diagonal blocks are the companion matrices of the
elementary divisors $\{p_i(x)^{e_{ij}}\}$ of T. Here the polynomials $p_i(x)$ are irreducible and
distinct. Let $f_i(x) = p_i(x)^{e_i}$, $1 \le i \le m$, be the elementary divisors of highest
degree, so that $\prod_{i=1}^{m} f_i(x)$ is the minimal polynomial of T with degree
$n = \sum_{i=1}^{m} \deg f_i$. In fact, it is sufficient to assume these are the only elementary divisors;
after dealing with this case, v can be augmented with zeros to cover the general
case.

Thus we assume T is the n-square matrix $\operatorname{diag}(A_1, A_2, \ldots, A_m)$ where $A_i$ is the
companion matrix of $f_i$, $1 \le i \le m$. Let $(v_1'\ v_2'\ \cdots\ v_m')'$ be the corresponding
partition of our desired (column) vector v. For $v_i$, let’s try the simplest possible
nonzero vector: just one nonzero entry (a 1, say), which should be the first entry
to maximize linear independence among $\{A_i^j v_i\}_{j \ge 0}$. This simple scheme would
work if only we had a guarantee that the resulting n-square matrix
$B := (v\ \ Tv\ \cdots\ T^{n-1}v)$ has linearly independent columns (equivalently, $\det B \ne 0$)
based solely on the premise of pairwise relative primeness of the polynomials $f_i$
from which T is formed. But for polynomials, relative primeness means no
common root (in an algebraic closure of F) and so a simple numerical quantity
guaranteed to be nonzero precisely when the $f_i$ are pairwise relatively prime is the
product of all differences of roots of different $f_i$’s. Call this product $\Delta$. We are thus
emboldened to conjecture that $\Delta$ and $\det B$ are closely related. In fact, as we show
in Theorem 2, they are equal (up to sign).

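To state this result in appropriate generality, let $\lambda_{ij}$, $1 \le i \le m$, $1 \le j \le n_i$, be
algebraically independent commuting indeterminates over F (the “roots”) and for
$1 \le i \le m$ let $\sigma_{ij}$, $1 \le j \le n_i$, denote the elementary symmetric polynomials in
$\{\lambda_{ij}\}_{j=1}^{n_i}$ (the “coefficients”). Then let
$f_i(x) = x^{n_i} - \sigma_{i1} x^{n_i - 1} + \sigma_{i2} x^{n_i - 2} - \cdots + (-1)^{n_i} \sigma_{i n_i}$,
$1 \le i \le m$, so that the roots of $f_i$ are $\{\lambda_{ij}\}_{j=1}^{n_i}$. Set $n = \sum_i n_i$.
The companion matrix of $f_i$ is

$$A_i = \begin{pmatrix} 0 & 0 & \cdots & 0 & (-1)^{n_i - 1} \sigma_{i n_i} \\ 1 & 0 & \cdots & 0 & (-1)^{n_i - 2} \sigma_{i, n_i - 1} \\ \vdots & \ddots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & \sigma_{i1} \end{pmatrix}$$

and the characteristic polynomial of $A_i$ is $f_i$. Now let A be the n-square block
diagonal matrix with $A_i$ as its ith diagonal block, $1 \le i \le m$, let $e_i$ denote the
$n_i$-length (column) vector with 1 in the first position and 0’s elsewhere, and let e be
the n-vector obtained by concatenating the $e_i$. Finally, let B denote the n-square
matrix whose jth column is $A^{j-1} e$, $1 \le j \le n$. For example, when $m = 3$ and
$n_1 = 3$, $n_2 = 2$, $n_3 = 1$,

$$B = \begin{pmatrix}
1 & 0 & 0 & \sigma_{13} & \sigma_{11}\sigma_{13} & \sigma_{11}^2\sigma_{13} - \sigma_{12}\sigma_{13} \\
0 & 1 & 0 & -\sigma_{12} & -\sigma_{11}\sigma_{12} + \sigma_{13} & -\sigma_{11}^2\sigma_{12} + \sigma_{11}\sigma_{13} + \sigma_{12}^2 \\
0 & 0 & 1 & \sigma_{11} & \sigma_{11}^2 - \sigma_{12} & \sigma_{11}^3 - 2\sigma_{11}\sigma_{12} + \sigma_{13} \\
1 & 0 & -\sigma_{22} & -\sigma_{21}\sigma_{22} & -\sigma_{21}^2\sigma_{22} + \sigma_{22}^2 & -\sigma_{21}^3\sigma_{22} + 2\sigma_{21}\sigma_{22}^2 \\
0 & 1 & \sigma_{21} & \sigma_{21}^2 - \sigma_{22} & \sigma_{21}^3 - 2\sigma_{21}\sigma_{22} & \sigma_{21}^4 - 3\sigma_{21}^2\sigma_{22} + \sigma_{22}^2 \\
1 & \sigma_{31} & \sigma_{31}^2 & \sigma_{31}^3 & \sigma_{31}^4 & \sigma_{31}^5
\end{pmatrix}$$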


Theorem 2. For the matrix B formed from the $\lambda_{ij}$ through their symmetric functions
$\sigma_{ij}$ as in (1),

$$\det B = \prod_{1 \le i < j \le m} \ \prod_{k=1}^{n_i} \ \prod_{l=1}^{n_j} (\lambda_{jl} - \lambda_{ik}). \qquad (2)$$
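
Theorem 2 can be spot-checked with exact arithmetic by substituting integers for the roots. The sketch below is an illustration, not the author's method: all function names are hypothetical, and the companion-matrix convention (ones on the subdiagonal, signed elementary symmetric functions in the last column) is the one assumed to match the worked example in the text. It builds B block by block from the vectors $A_i^{j-1} e_i$ and compares $\det B$ with the product of root differences.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def elementary_symmetric(roots):
    """Return [sigma_1, ..., sigma_n] for the given roots."""
    n = len(roots)
    return [sum(prod(c) for c in combinations(roots, k)) for k in range(1, n + 1)]


def companion(roots):
    """Companion matrix of prod (x - root), ones on the subdiagonal."""
    n = len(roots)
    sigma = elementary_symmetric(roots)
    a = [[0] * n for _ in range(n)]
    for r in range(1, n):
        a[r][r - 1] = 1
    for r in range(n):
        k = n - r  # row r of the last column carries (-1)^(k-1) sigma_k
        a[r][n - 1] = (-1) ** (k - 1) * sigma[k - 1]
    return a


def mat_vec(a, v):
    return [sum(a[r][c] * v[c] for c in range(len(v))) for r in range(len(a))]


def build_B(root_blocks):
    """Rows of B: for each block i, the vectors e_i, A_i e_i, A_i^2 e_i, ..."""
    n = sum(len(block) for block in root_blocks)
    rows = []
    for roots in root_blocks:
        a = companion(roots)
        v = [1] + [0] * (len(roots) - 1)
        columns = []
        for _ in range(n):
            columns.append(v)
            v = mat_vec(a, v)
        for r in range(len(roots)):
            rows.append([col[r] for col in columns])
    return rows


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def delta(root_blocks):
    """Product over i < j of (lambda_jl - lambda_ik), the right side of (2)."""
    d = 1
    for i, j in combinations(range(len(root_blocks)), 2):
        for lam_i in root_blocks[i]:
            for lam_j in root_blocks[j]:
                d *= lam_j - lam_i
    return d
```

For instance, `det(build_B([[1, 2, 3], [5, 7], [11]]))` can be compared with `delta([[1, 2, 3], [5, 7], [11]])`; a block pair sharing a root, such as `[[1, 2], [2, 3]]`, should make both vanish.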


Before proving Theorem 2 (and thereby also Kaplansky’s Lemma) we make 
some remarks. The classical resultant of two polynomials expresses the right side 
of (2) (for m = 2) as the determinant of the Sylvester matrix [7, p. 102] also formed 
from the polynomials’ coefficients. Thus Theorem 2 gives a form of generalized 
resultant, though the matrix B does not coincide with the Sylvester matrix for 
m = 2. Further information about resultants (including a generalization in a
different direction) and their application in finding greatest common divisors of
polynomials is given in [2, §2.1]. 

What does the matrix B look like when expressed in terms of the $\lambda$’s? The
following two propositions give some useful information.


Proposition 1. All nonzero terms in the determinant of any submatrix of B have the
same total degree in the $\lambda$’s (depending only on the submatrix).


Proof: The matrix B contains various size identity matrices flush left; the other
entries are polynomials in the $\sigma$’s that are easily seen to be homogeneous
polynomials in the $\lambda$’s whose (total) degrees are as illustrated in Figure 1 (for
$m = 3$; $n_1 = 3$, $n_2 = 2$, $n_3 = 1$). Here the blanks (shown as —) represent zero entries in B.

     0  —  —  3  4  5
     —  0  —  2  3  4
     —  —  0  1  2  3
     0  —  2  3  4  5
     —  0  1  2  3  4
     0  1  2  3  4  5

Figure 1

     0  1  2  3  4  5
    −1  0  1  2  3  4
    −2 −1  0  1  2  3
     0  1  2  3  4  5
    −1  0  1  2  3  4
     0  1  2  3  4  5

Figure 2


It is possible to fill in the blanks so that each row is a sequence of consecutive
integers, yielding a matrix C as in Figure 2. The degree of the product of given
nonzero entries in B is the sum of the corresponding entries in C. Thus it certainly
suffices to show: for any square submatrix D of C, the sum of the entries in a
diagonal is constant over all diagonals of D (a diagonal is a maximal set of entries,
no two in the same row or column). Now the entries in C are of the form
$c_{ij} = d_i + e_j$ for some sequences $\{d_i\}$, $\{e_j\}$ ($e_j$ may be taken $= j$). Thus the sum of
the entries in any diagonal of D is the sum of the $d_i$ corresponding to the rows of
D and the $e_j$ corresponding to the columns of D, independent of the diagonal.


Proposition 2. Suppose B is partitioned into $m^2$ submatrices $B_{ij}$ of size $n_i$-by-$n_j$ in the
obvious way. Fix i in [1, m]. Then

(i) $B_{ii} = A_i^{\,n_1 + n_2 + \cdots + n_{i-1}}$,
(ii) $\det B_{ii} = (\lambda_{i1} \lambda_{i2} \cdots \lambda_{i n_i})^{n_1 + n_2 + \cdots + n_{i-1}}$,
(iii) The degree (in the $\lambda$’s) of each term in $\det B_{ii}$ is $n_i (n_1 + n_2 + \cdots + n_{i-1})$,
(iv) If $B_i'$ is any $n_i$-square submatrix of B involving the same rows as $B_{ii}$ but no
column to the right of the rightmost column of $B_{ii}$, then the degree of each term
in $\det B_i'$ is strictly less than $n_i (n_1 + n_2 + \cdots + n_{i-1})$ unless $B_i' = B_{ii}$,
(v) The degree of each term in $\det B$ is $\sum_{1 \le i < j \le m} n_i n_j$.


Proof:

(i) If an $n_i$-square window slides from left to right along the ith row of the
partitioned B, the successive powers $I, A_i, A_i^2, \ldots$ are obtained.
(ii) This is clear since $\det A_i = \sigma_{i n_i}$.
(iii) follows from (ii) and Proposition 1.
(iv) follows from (ii) and Proposition 1 since degrees are strictly increasing
along each row of B.
(v) also follows from (iii) and Proposition 1. ∎


Now to the proof of Theorem 2 in three steps: a) the right side of (2), $\Delta$, divides
$\det B$, b) $\Delta$ and $\det B$ have the same degree and hence a) implies they can differ
only by a scalar factor, c) $\Delta$ and $\det B$ contain an identical term and hence
this scalar factor is 1. At this point only step a) requires some work since the
degree of $\Delta$ is $\sum_{1 \le i < j \le m} n_i n_j$, as is the degree of $\det B$ by Proposition 2(v).
Also, the product of the first terms of the factors comprising $\Delta$ is

$$X := (\lambda_{21} \cdots \lambda_{2 n_2})^{n_1} (\lambda_{31} \cdots \lambda_{3 n_3})^{n_1 + n_2} \cdots (\lambda_{m1} \cdots \lambda_{m n_m})^{n_1 + \cdots + n_{m-1}}$$

(and this monomial does not occur elsewhere in $\Delta$). On the other hand, by Proposition 2 (iv) and
a Laplace expansion of $\det B$, only the entries in the main diagonal blocks
$B_{11}, B_{22}, \ldots, B_{mm}$ can produce the monomial X and their contribution is precisely
$\det B_{11} \det B_{22} \cdots \det B_{mm} = X$ by Proposition 2 (ii).

It remains to establish a). Clearly, $\det B$ lies in the polynomial ring
$F[\lambda_{11}, \ldots, \lambda_{m n_m}]$. First, we show by a standard technique that each factor in $\Delta$ is a
divisor of $\det B$ in this ring. To do so, suppose $\lambda_{ik} = \lambda_{jl}$ for some i, j, k, l with
$i \ne j$. This means that $A_i$ and $A_j$ share a common eigenvalue $\lambda_{ik} = \lambda_{jl} = \lambda$. Let $u_i$
and $u_j$ be left (row) eigenvectors of $A_i$ and $A_j$, respectively, corresponding to $\lambda$.
Then for all $r \ge 0$,

$$(u_i \ \ u_j) \begin{pmatrix} A_i & 0 \\ 0 & A_j \end{pmatrix}^{r} = \lambda^r (u_i \ \ u_j).$$

We claim the eigenvector $u_j$ can be scaled so that the inner product

$$(u_i \ \ u_j) \begin{pmatrix} e_i \\ e_j \end{pmatrix} = 0.$$

Since only the first components of $e_i$, $e_j$ are nonzero, it clearly suffices to show that
both inner products $u_i e_i$ and $u_j e_j$ are nonzero. But if, say, $u_i e_i = 0$ then for all
$r \ge 0$, $u_i A_i^r e_i = \lambda^r u_i e_i = 0$ while $\{A_i^r e_i\}_{r \ge 0}$ spans all of $F^{n_i}$ (its first $n_i$ vectors are
the standard basis for $F^{n_i}$). Thus $u_i \in F^{n_i}$ would be perpendicular to $F^{n_i}$, hence 0,
and eigenvectors are not allowed to be 0. Thus the claim is established. It follows
that

$$(u_i \ \ u_j) \begin{pmatrix} A_i & 0 \\ 0 & A_j \end{pmatrix}^{r} \begin{pmatrix} e_i \\ e_j \end{pmatrix} = 0 \qquad \text{for all } r \ge 0.$$

In other words, $(u_i \ \ u_j)$ provides a nontrivial, yet vanishing, linear combination of
the rows of B indexed by its i and j blocks. Thus B is singular and $\det B = 0$.
Hence $\lambda_{ik} - \lambda_{jl}$ divides $\det B$ in the ring $F[\lambda_{11}, \ldots, \lambda_{m n_m}]$. Since each $\lambda_{ik} - \lambda_{jl}$ is
prime in this ring, their product $\Delta$ also divides $\det B$. Thus Theorem 2 holds and
our generalized resultant yields yet another proof of Kaplansky’s Lemma. ∎
Metric Spaces in Which All Triangles
Are Degenerate


Bettina Richmond and Thomas Richmond 


In any subspace of the real line $\mathbb{R}$ with the usual Euclidean metric $d(x, y) = |x - y|$,
every triangle is degenerate. In $\mathbb{R}^2$ or $\mathbb{R}^3$ with the usual Euclidean metrics, a
triangle is degenerate if and only if its vertices are collinear. With our intuition of a
degenerate triangle having “collinear vertices” extended to arbitrary metric spaces,
we might expect that a metric space in which every triangle is degenerate must be
“linear”. It might be reasonable to expect that any “linear” metric space is
isometric to a subset of $\mathbb{R}$ with the usual metric. When classifying all metric spaces
that have only degenerate triangles, we find that there are such metric spaces
other than (isometric images of) subspaces of $\mathbb{R}$. These other spaces, however, all
have precisely four points and all are of the same form. In the final section, we
illustrate that the usual topology on $\mathbb{R}^2$ cannot be generated by any metric in
which all triangles are degenerate.

A metric space $(M, \rho)$ is a set of points M with a metric, or distance function,
$\rho: M \times M \to [0, \infty)$ that satisfies some natural properties we expect of distances:

$\rho(x, y) = \rho(y, x)$ for any $x, y \in M$,
$\rho(x, y) = 0$ if and only if $x = y$, and
$\rho(x, y) + \rho(y, z) \ge \rho(x, z)$ for any $x, y, z \in M$ (triangle inequality).

If $(M, \rho)$ and $(N, \delta)$ are two metric spaces, a function $f: M \to N$ such that
$\rho(x, y) = \delta(f(x), f(y))$ for any $x, y \in M$ is called an isometry. Isometries are
always one-to-one. The metric spaces $(M, \rho)$ and $(N, \delta)$ are isometric if there
exists an isometry from M onto N. Generally, one does not distinguish between
isometric metric spaces.

If $(M, \rho)$ is a given metric space, we say that $\{x_1, x_2, x_3\} \subseteq M$ forms a degenerate
triangle if $\rho(x_i, x_j) + \rho(x_j, x_k) = \rho(x_i, x_k)$ for some permutation i, j, k of the
indices {1, 2, 3}. Thus, $\{x_1, x_2, x_3\}$ forms a degenerate triangle if the triangle
inequality is an equality for some permutation of the points $x_1$, $x_2$, and $x_3$.

We want to study metric spaces in which every triangle is degenerate, that is,
metric spaces such that every 3-element subset forms a degenerate triangle. For
brevity, such a metric space is called a degenerate space, and its metric is called a
degenerate metric. Triangles with fewer than three distinct vertices are clearly
degenerate, and in what follows, we assume that all triangles have three distinct
vertices, unless otherwise noted. With this understanding, we can make statements
such as “A four-point space has exactly four distinct triangles.” The real line with
the usual Euclidean metric $d(x, y) = |x - y|$ is denoted by $\mathbb{R}$. It is a familiar fact
that $\mathbb{R}$ is a degenerate space. Any subspace of a degenerate space is a degenerate
space. Given any degenerate space M, one might suspect that it must be isomorphic
to a subspace of $\mathbb{R}$, and might attempt to construct an isometry from M into
$\mathbb{R}$. As we will see, the construction of such an isometry requires that if {y, o, p} and
{o, p, x} are (degenerate) triangles in M with longest sides {y, p} and {o, x}
respectively, then triangle {y, o, x} must have longest side {y, x}. Surprisingly, this
need not be true. Thus, the structure of a degenerate space is based upon the
structure of its 4-point subspaces. We consider 4-point degenerate spaces in the
next section; in Section 2 we find that the problematic 4-point subspaces cannot
occur if the space has more than 4 points.


1. FOUR-POINT DEGENERATE SPACES ARE OF TWO FORMS. Throughout
this section, X = {S, T, U, V} denotes a 4-point degenerate space. The four points
of X are represented by the vertices and center of a (Euclidean) equilateral
triangle as shown in Figure 1. The distances between points are denoted by labels
on the corresponding edges of this graph. Often we do not distinguish between an
edge and its length. By the longest edge of some set, we mean any edge of maximal
length from that set. After a permutation of the vertices of X, we may assume that
the edge {V, T} is maximal among the six edges, and that {V, S} is maximal among
the remaining four edges having V or T as a vertex. If we let $\rho(V, S) = a$,
$\rho(S, T) = b$, $\rho(U, V) = c$, and $\rho(U, T) = d$, then since {V, T} is the longest edge in
the triangles {V, T, S} and {V, T, U}, we must have $\rho(V, T) = a + b = c + d$, as
shown in Figure 2. By the choice of the edge {V, S}, we know that $a \ge c$. Hence
either {V, S} or {S, U} is the longest edge of the triangle {V, S, U}, and thus either
$\rho(S, U) = a - c$ (Case 1) or $\rho(S, U) = a + c$ (Case 2). Depending on which edge
of triangle {S, T, U} is the longest, either $\rho(S, U) = b + d$ (Case A), $\rho(S, U) = b - d$
(Case B), or $\rho(S, U) = d - b$ (Case C). Among the 6 cases that result from
considering which are the longest edges of these two triangles, the cases 1A, 1B,
2B, and 2C are easily seen to be impossible. For example, in case 1B, the equations
$a - c = b - d$ and $a + b = c + d$ yield $a - c = 0$, contrary to S and U being
distinct points with $\rho(S, U) = a - c$. In the case 1C, the equations $a - c = d - b$
and $a + b = c + d$ are equivalent, while in the case 2A, the equations $a + c = b + d$
and $a + b = c + d$ yield $a = d$ and $b = c$. The two configurations
corresponding to these two cases 1C and 2A are pictured in Figure 3. Both of these are
possible, and any 4-point degenerate space must be of one of these two forms.


Figure 1

Figure 2 (edge {V, T} labeled a + b = c + d)


There are familiar models for both of these 4-point metric spaces. The form
from Figure 3a can be achieved using Euclidean distance on the real line. If
positive distances a, b, and c < a have been chosen, then letting V be any real
number, $T = V + a + b$, $S = V + a$, and $U = V + c$ gives a 4-point subspace of $\mathbb{R}$
in which the Euclidean distances between the points agree with those in the form.
Because of this model, we call this configuration, shown in Figure 4, the linear
form of a 4-point degenerate space. The other 4-point degenerate space, shown in
Figure 3b, can be redrawn as a square with its diagonals, as shown in Figure 5a. It
should be clear that no such configuration can be realized with positive Euclidean
distances, but the square suggests a familiar model, shown in Figure 5b. Let {S, U}
and {T, V} be distinct pairs of diametrically opposite points on a circle with
circumference $2a + 2b$ such that the points V and S determine a central angle of
$\pi a/(a + b)$ radians. Let the distance between two points on the circle be the
length of the shortest arc of the circle connecting the two points. This familiar
arc-length metric restricted to the set {S, T, U, V} is a realization of the 4-point
degenerate space of Figure 3b. Because of this model, we call this configuration
the circular form of a 4-point degenerate space.

Figure 4, Linear Form.


Figure 5a (square with vertices U, V, S, T; sides UV = TS = b and UT = VS = a; both diagonals a + b)

Figure 5b

Figure 5, Circular Form.
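
The definition of a degenerate space is easy to test mechanically on small examples. The sketch below is only an illustration with hypothetical helper names: it checks every 3-element subset of a finite metric space for a degenerate triangle and applies the check to the linear and circular 4-point forms described above. Integer distances are used so that the equality test is exact.

```python
from itertools import combinations, permutations


def is_degenerate_triangle(dist, x, y, z):
    """True if the triangle inequality is an equality for some ordering."""
    return any(dist(p, q) + dist(q, r) == dist(p, r)
               for p, q, r in permutations((x, y, z)))


def is_degenerate_space(points, dist):
    """True if every 3-element subset of points forms a degenerate triangle."""
    return all(is_degenerate_triangle(dist, *triple)
               for triple in combinations(points, 3))


def linear_form(a, b, c, v=0):
    """Points V, U, S, T on the real line: V, V + c, V + a, V + a + b."""
    coords = {"V": v, "U": v + c, "S": v + a, "T": v + a + b}
    return list(coords), lambda p, q: abs(coords[p] - coords[q])


def circular_form(a, b):
    """Square S, T, U, V with sides a, b alternating and diagonals a + b."""
    table = {
        frozenset("VS"): a, frozenset("UT"): a,
        frozenset("ST"): b, frozenset("UV"): b,
        frozenset("VT"): a + b, frozenset("SU"): a + b,
    }
    return list("STUV"), lambda p, q: 0 if p == q else table[frozenset((p, q))]
```

With these helpers, `is_degenerate_space(*linear_form(3, 2, 1))` and `is_degenerate_space(*circular_form(3, 2))` both exercise the four triangles of a 4-point space, while a 3-4-5 Euclidean triangle fails the test.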
2. FIVE-POINT DEGENERATE SPACES ARE OF LINEAR FORM. The structure
of any degenerate space is based on its 4-point degenerate subspaces, which must 
be of linear or circular form. In this section, we show that in any 5-point 
degenerate space all 4-point subspaces are of linear form. Thus, the problem 
encountered in defining an isometry from any degenerate metric space M into R 
can only arise in 4-point spaces (of circular form). 

Before considering general 5-point degenerate spaces, we make a few observations
on the specific metrics presented in Section 1 as models of 4-point degenerate
spaces. It is clear that extending our Euclidean model of a 4-point degenerate
space of linear form to any larger subset of $\mathbb{R}$ results in a degenerate space.
However, a degenerate space {S, T, U, V} of circular form cannot be extended to
any larger degenerate space consisting of points on a circle with the arc-length
metric. If R were another point on the circle, relabel the points of {S, T, U, V} so
that R, S, T, U, and V appear in that order along the circle (traced in any specified
direction). Now R, T, and U do not lie in any semicircle, and thus, the sum of the
distances from any one of these points to the other two is strictly greater than the
arc-length of a semicircle. In our “length of shortest arc” metric, however, no
distance can exceed the length of a semicircle. Thus, the triangle formed by R, T,
and U cannot be degenerate.

We now show that there is no extension of the 4-point degenerate space of
circular form to any 5-point degenerate metric space. We start with the 4-point
circular form {S, T, U, V} in its square representation and add a fifth point R, as
shown in Figure 6, and assume that this space is degenerate. Two opposite edges
of the square have lengths x and the other pair of opposite edges of the square
have lengths y. Let a be the maximal length of the four edges having R as a
vertex. Without loss of generality, we may label the 5-point degenerate space as
shown in Figure 7.


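[Figure 6: the square representation of the circular form, with diagonals of length x + y, together with the added point R.]

[Figure 7: the labelled 5-point space, with vertices U, V, S, T, edges UV and TS of length y, diagonals of length x + y, and the edges from R labelled a, b, c, d.]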




We first show that a ≤ x + y, that is, x + y is the length of a longest edge in 
Figure 7. Suppose to the contrary that a > x + y. Let j be the largest integer such 
that a > x + jy and let k be the largest integer such that a > kx + y. Since a is 
the length of the longest edge in triangles {R, S, V} and {R, U, V}, we have 
a = x + c = d + y, and thus c > jy and d > kx. Adding these inequalities gives 
c + d > x + y, so the longest edge in triangle {R, S, U} has length c or d. If c were 
the longest, then c = d + x + y > (k + 1)x + y ≥ a, contrary to a being the 
longest edge incident on R. Similarly, d cannot be the length of the longest edge 
of {R, S, U}. Thus, a ≤ x + y. 

Now we show that the assumption that Figure 7 represents a degenerate space 
leads to the contradiction that the set of real numbers {x, y, a} has no maximum. 
By the preceding paragraph, the longest edges of triangles {R, S, U} and {R, T, V} 
have length x + y, and thus a + b = c + d = x + y. Suppose x = max{x, y, a}. 
The choice of a implies that x = max{x, y, a, b, c, d}. From triangles {R, S, V} and 
{R, T, U}, we have x = a + c = d + b. Since a + b = c + d, it follows that a = d 
and b = c. Thus, x = a + c = a + b = x + y, giving the contradiction that y = 0. 
By the symmetry of x and y, the case y = max{x, y, a} is also impossible. Finally, 
suppose a = max{x, y, a}. From triangles {R, V, S} and {R, V, U}, we have 
a = c + x = d + y. It follows that 2a = c + d + x + y = 2(a + b), contrary to b ≠ 0. 

Thus, the 4-point degenerate space of circular form cannot be extended to a 
5-point degenerate space, and therefore cannot be extended to any degenerate 
space with more than 4 points. 


3. CLASSIFICATION OF ALL DEGENERATE SPACES. Eliminating circular 
form subsets from spaces with 5 or more points clears the way for the classification 
of not only 5-point degenerate spaces, but of all degenerate spaces. 

It is clear that every metric space with fewer than three points is degenerate and 
isometric to a subspace of R. Any degenerate 3-point space consists of one 
degenerate triangle with edges of length, say, a,b, and a +b. Such a space is 
clearly isometric to the subspace {-a, 0, b} of R. Four-point degenerate spaces are 
of one of the two forms previously described. In any degenerate space with more 
than four points, every 4-point subspace is degenerate, of linear form. 

Suppose M is a degenerate space with more than four points. We will construct 
an isometry between M and a subspace of R. Pick any two distinct points o and p 
of M. The point o plays the role of the "origin" and the point p plays the role of a 
"positive point". Define a function f: M → R by 


$$f(x) = \begin{cases} -\rho(o,x) & \text{if } \{x,p\} \text{ is the longest edge of triangle } \{x,o,p\}, \\ \rho(o,x) & \text{if } \{x,o\} \text{ or } \{o,p\} \text{ is the longest edge of triangle } \{x,o,p\}. \end{cases}$$

We first show that f is well defined. Observe that {x, p} and {x, o} cannot both 
be longest edges of triangle {x, o, p}, for then ρ(x, p) = ρ(x, o) + ρ(o, p) and 
ρ(x, o) = ρ(x, p) + ρ(p, o) lead to the contradiction that ρ(o, p) = 0. If {x, p} 
and {o, p} are both longest edges of triangle {x, o, p}, then ρ(x, p) = ρ(x, o) + 
ρ(o, p) and ρ(o, p) = ρ(o, x) + ρ(x, p), and it follows that ρ(o, x) = 0 = 
-ρ(o, x), so that both definitions of the value of f(x) agree. 

We now show that f is an isometry. If x = y, then clearly ρ(x, y) = 0 = 
|f(x) - f(y)| = d(f(x), f(y)). If one of x or y is o, say y = o, then f(x) = 
±ρ(o, x), so ρ(x, o) = |f(x) - 0| = |f(x) - f(o)| = d(f(x), f(y)). We may thus 
assume that x, y, and o are distinct. If {x, y, o, p} is not already a 4-point subset of 
M it may be extended to a 4-point subset of M, which, after our unexpected 
combinatorial detour, we now know must be of linear form as shown in Figure 4. 
The linear form of Figure 4 can be linearly ordered in two natural ways: left to 
right or right to left. Give {x, y, o, p} the natural linear order from its linear 
representation as in Figure 4 in which o < p. Without loss of generality, let us 
assume x < y in this order. There are three cases to consider: either x < y < o, 
x < o < y, or o < x < y. If x < y < o, then ρ(x, o) = ρ(x, y) + ρ(y, o), so ρ(x, y) = 
ρ(x, o) - ρ(y, o). Since x < o < p, f(x) = -ρ(o, x), and similarly f(y) = 
-ρ(o, y). Thus, ρ(x, y) = f(y) - f(x). Since the distance from x to y is positive, 
we have ρ(x, y) = f(y) - f(x) = |f(x) - f(y)|. If x < o < y, then ρ(x, y) = 
ρ(x, o) + ρ(o, y). Since o < y, either {o, p} or {o, y} is the longest edge of 
{y, o, p}, so f(y) = ρ(o, y). As before, x < o < p implies f(x) = -ρ(o, x). Thus, 
ρ(x, y) = f(y) - f(x) = |f(x) - f(y)|. Finally, if o < x < y, then ρ(x, y) = 
ρ(o, y) - ρ(o, x) = f(y) - f(x) = |f(x) - f(y)|. Thus, for any choice of x, y ∈ M, 
we have shown that ρ(x, y) = |f(x) - f(y)| = d(f(x), f(y)). This completes the 
proof that any degenerate space with more than four points is isometric to a 
subspace of R. 

Note that if M is a 4-point degenerate space of circular form, the function f 
defined in the preceding paragraph is still well defined but is not an isometry. 
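
As an illustration of the construction and of this remark, here is a small Python sketch. It is an assumption-laden example rather than part of the proof: the helper names `make_f` and `is_isometry`, the 6-point subset of R, and the edge lengths a = 2, b = 3 for the circular form are all chosen here for illustration. Integer distances are used so that equality tests are exact.

```python
from itertools import combinations


def make_f(rho, o, p):
    """Build the map f of the text for a metric rho and base points o, p."""
    def f(x):
        # Case 1: {x, p} is a longest edge of triangle {x, o, p}.
        if rho(x, p) >= rho(x, o) and rho(x, p) >= rho(o, p):
            return -rho(o, x)
        # Case 2: {x, o} or {o, p} is the longest edge.
        return rho(o, x)
    return f


def is_isometry(points, rho, f):
    """Check rho(x, y) == |f(x) - f(y)| for every pair of points."""
    return all(rho(x, y) == abs(f(x) - f(y)) for x, y in combinations(points, 2))


# Linear example: a 6-point subset of R with the usual metric.
line = [-3, -1, 0, 2, 5, 7]
line_rho = lambda x, y: abs(x - y)
f_line = make_f(line_rho, o=-1, p=2)

# Circular form with a = 2, b = 3, labelled as in Figure 5a:
# UV = TS = b, UT = VS = a, and both diagonals have length a + b.
table = {
    frozenset("UV"): 3, frozenset("TS"): 3,
    frozenset("UT"): 2, frozenset("VS"): 2,
    frozenset("US"): 5, frozenset("VT"): 5,
}
circle = list("STUV")
circle_rho = lambda x, y: 0 if x == y else table[frozenset((x, y))]
f_circle = make_f(circle_rho, o="S", p="T")

print([f_line(x) for x in line], is_isometry(line, line_rho, f_line))
print([f_circle(x) for x in circle], is_isometry(circle, circle_rho, f_circle))
```

In the circular example f is still single-valued at every point, but the distance between U and V is not preserved, which is the failure the note describes.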


4. TOPOLOGICAL CONSIDERATIONS. Our motivation for considering degenerate 
spaces originated with a topological question. Denote the topology generated 
by the Euclidean metric on R^n by τ_n. With the usual Euclidean metrics, R^2 has 
non-degenerate triangles but R does not. Does this alone provide another verification 
that (R, τ_1) and (R^2, τ_2) are not homeomorphic? 

There are two immediate observations we should make. First, metrizable 
topological spaces can be generated by several different metrics which may not 
share common properties such as boundedness or "degenerateness". In particular, 
observe that though (R, τ_1) is generated by a degenerate metric d(x, y) = |x - y|, 
it is also generated by non-degenerate metrics. If we embed R into R^2 as the graph 
of y = x^2 and give this parabola P = {(x, y) ∈ R^2 : y = x^2} the usual Euclidean 
distances from R^2, then P has no degenerate triangles, yet P is homeomorphic to 
(R, τ_1). Secondly, it is easily seen that "metrizable with a degenerate metric" is a 
topological property, for if (M, ρ) is a degenerate space homeomorphic to a 
topological space X, then a degenerate metric can be defined on X by making the 
homeomorphism from X to M an isometry. Thus, though the Euclidean metric on 
R is degenerate and the Euclidean metric on R^2 is not, this is not sufficient to 
conclude that (R, τ_1) is not homeomorphic to (R^2, τ_2). For this conclusion, we must 
show that (R^2, τ_2) admits no degenerate metric. 

If our goal were to show only that (R^2, τ_2) is not generated by any degenerate 
metric, then we could simply note that this is an immediate consequence of the 
fact that (R^2, τ_2) is not homeomorphic to a subspace of (R, τ_1) and the classification 
of infinite degenerate spaces. However, since we want to use the fact that 
(R^2, τ_2) admits no degenerate metric to show that (R^2, τ_2) is not homeomorphic to 
(R, τ_1), we now outline a proof not dependent upon this latter fact. 

Suppose ρ is a degenerate metric on R^2 that generates the Euclidean topology 
τ_2. Pick distinct points a, b, and x from R^2 such that {a, b} is the longest side of 
triangle {a, x, b}. Thus ρ(a, x) < ρ(a, b) and ρ(x, b) < ρ(a, b). Using the fact that 
ρ(a, x) is a continuous function of x, if the "middle point" x is moved slightly to 
x(t), ρ(a, b) remains the longest side of the resulting triangle {a, x(t), b}. These 
perturbations of the middle point are actually not restricted to slight movements: 
in fact, every point z of R^2 \ {a, b} is a middle point of {a, z, b}! Suppose there 
exists some point z of R^2 \ {a, b} such that {a, b} is not the longest side of {a, z, b}. 
Let the points x(t) slide along a path in R^2 \ {a, b} from x to z. There must exist a 
first point y on the path for which ρ(a, y) = ρ(a, b) or ρ(y, b) = ρ(a, b). Since 
ρ(a, x(t)) + ρ(x(t), b) = ρ(a, b) for all the points x(t) on the path before y, it 
follows that ρ(a, y) + ρ(y, b) = ρ(a, b), and hence y ∈ {a, b}. This contradicts the 
fact that the path avoids the points a and b. Thus, if we select any triangle {a, x, b} 
with x "between" a and b (that is, with longest side {a, b}), then every point of 
R^2 \ {a, b} must be "between" a and b. In particular, ρ(a, y) ≤ ρ(a, b) for every 
y ∈ R^2. This implies that our degenerate metric on R^2 is bounded by 2ρ(a, b) 
since ρ(x, y) ≤ ρ(x, a) + ρ(a, y) ≤ ρ(a, b) + ρ(a, b) for any x and y. Our contradiction 
will come from the fact that this should hold for any points a and b that 
happen to form the longest side of some triangle. Pick any two distinct points 
z, w ∈ R^2 and let ε = ρ(z, w). Let {x_n} be any sequence converging to the origin 0 
in (R^2, τ_2). Since {ρ(x_n, 0)} converges to zero, we may pick points x_i and x_j such 
that the longest edge of triangle {x_i, x_j, 0} has length less than ε/2. But ρ is bounded 
by twice the length of the longest edge of this triangle, contrary to the choice of 
ε = ρ(z, w) for some z, w ∈ R^2. This shows that (R^2, τ_2) does not admit any 
degenerate metric, and thus is not homeomorphic to (R, τ_1). 

Using the topological property "metrizable by a degenerate metric" to distinguish 
between R and R^2 is a good exercise in manipulating metrics, but one should 
recognize the dependence of our proof upon another more common topological 
property. The argument that if x is "between" a and b, then every y ∈ R^2 \ {a, b} 
is "between" a and b utilized the fact that R^2 \ {a, b} is path connected (relative to 
τ_2). Removing two points from (R, τ_1) never gives a path connected space, 
providing a more standard argument that (R, τ_1) is not homeomorphic to (R^2, τ_2). 
On the other hand, proving that (R^2, τ_2) is not metrizable by a degenerate metric 
shows not only that (R^2, τ_2) is not homeomorphic to (R, τ_1), but also that (R^2, τ_2) is 
not homeomorphic to any subspace of (R, τ_1). 
Partitions of Unity for Countable Covers 


Albert Fathi 


This paper should be considered as expository classroom notes for the instructor. 

Existence of partitions of unity for metric spaces is usually proved via some 
rather exacting set-theoretical and topological arguments using the (equivalent) 
concept of paracompactness. Although the standard proof of paracompactness of 
metric spaces is the one given by M. E. Rudin [Ru], in his 1965 PhD thesis, 
Michael Mather showed that it is easier, for metric spaces, to show directly the 
existence of locally finite partitions of unity for arbitrary covers [Ma]. This proof 
does not seem to be so well-known although it has been reproduced in [Bo] (see 
also the appendix of [Do]). It is astonishing that Mather’s arguments do not appear 
in recent textbooks. We hope that the following exposition for countable covers 
will popularise it. An advantage of the method is that the same proof can be used 
in the smooth category. 

Although it is formally covered by the countable case, we will first explain the 
method in the case of finite covers. This will show how easy it is to obtain 
partitions of unity for compact metric spaces. 

For the sake of completeness, let us recall a few definitions. If X is a 
topological space, the support of a continuous function φ: X → R is the closed set 
supp(φ), the closure of {x | φ(x) ≠ 0}. A partition of unity on X is a family (φ_i)_{i∈I} of 
continuous functions φ_i: X → [0,1] such that Σ_{i∈I} φ_i(x) = 1 for every x ∈ X. Such a 
partition of unity is locally finite if for every x ∈ X there exists a neighbourhood V_x 
of x such that {i ∈ I | V_x ∩ supp(φ_i) ≠ ∅} is finite. If (U_i)_{i∈I} is an open cover of X, 
a partition of unity subordinated to the cover (U_i)_{i∈I} is a partition of unity (φ_i)_{i∈I} 
such that supp(φ_i) ⊂ U_i for every i ∈ I. 


1. FINITE COVERS OF METRIC SPACES. Let U_1, ..., U_n be an open cover of 
the metric space X. We wish to construct a partition of unity (φ_i)_{1≤i≤n} such that 
supp(φ_i) is contained in U_i for each i = 1, ..., n. 

We start with the continuous functions f_i(x) = d(x, X \ U_i) and we set F(x) = 
Σ_{i=1}^n f_i(x). Observe that F(x) > 0 everywhere, since U_1, ..., U_n is a cover of X and 
f_i > 0 on U_i. Next, we define a continuous function g_i: X → [0, ∞[ by 

$$g_i(x) = \max\Bigl(f_i(x) - \frac{1}{n+1}F(x),\, 0\Bigr).$$

We claim that supp(g_i) ⊂ U_i and Σ_{i=1}^n g_i(x) > 0 everywhere. 

To prove the first assertion, note that supp(g_i), the closure of {x | f_i(x) > F(x)/(n + 1)}, is 
contained in the closed set {x | f_i(x) ≥ F(x)/(n + 1)}. Since F(x) > 0 everywhere, 
the closed set {x | f_i(x) ≥ F(x)/(n + 1)} is itself contained in U_i = {x | f_i(x) > 0}. 

To prove the second assertion, write 

$$\sum_{i=1}^{n} g_i(x) \ge \sum_{i=1}^{n}\Bigl(f_i(x) - \frac{1}{n+1}F(x)\Bigr) = F(x) - \frac{n}{n+1}F(x) = \frac{1}{n+1}F(x).$$

To obtain the partition of unity (φ_i)_{1≤i≤n}, define φ_i(x) = g_i(x)/(Σ_{j=1}^n g_j(x)). 
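
To see the construction at work, here is a minimal Python sketch under some simplifying assumptions that are not in the text: X is taken to be the real line with its usual metric, each U_i is an open interval (possibly unbounded), and the names `dist_to_complement` and `partition_of_unity` are hypothetical helpers.

```python
import math


def dist_to_complement(x, interval):
    """Distance from x to the complement of the open interval (lo, hi)."""
    lo, hi = interval
    return max(0.0, min(x - lo, hi - x))


def partition_of_unity(cover):
    """Return phi_1, ..., phi_n built as in the text for a finite cover of R."""
    n = len(cover)

    def f(i, x):
        return dist_to_complement(x, cover[i])

    def g(i, x):
        big_f = sum(f(j, x) for j in range(n))  # F(x) > 0 because the U_i cover R
        return max(f(i, x) - big_f / (n + 1), 0.0)

    def phi(i, x):
        return g(i, x) / sum(g(j, x) for j in range(n))

    return [lambda x, i=i: phi(i, x) for i in range(n)]


# An illustrative cover of R by three open intervals.
cover = [(-math.inf, 1.0), (0.0, 2.0), (1.5, math.inf)]
phis = partition_of_unity(cover)
for x in [-3.0, 0.5, 1.2, 1.75, 4.0]:
    print(x, [round(p(x), 4) for p in phis])
```

At each sample point the printed values are nonnegative, sum to 1 up to rounding, and vanish for every interval that does not contain the point.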
2. COUNTABLE COVERS OF METRIC SPACES. Let X be a metric space and 
let (U_i)_{i∈N} be a countable cover. We will modify our argument to show how to 
construct a locally finite partition of unity (φ_i)_{i∈N} such that supp(φ_i) is contained 
in U_i for i = 1, 2, .... 

Define f_i: X → [0, 2^{-i}] by 

$$f_i(x) = \min\bigl(d(x, X \setminus U_i),\, 2^{-i}\bigr).$$

Then f_i > 0 on U_i, f_i = 0 outside U_i, and 

$$\|f_i\|_0 = \sup\{|f_i(x)| : x \in X\} \le 2^{-i}.$$

This inequality implies that the function F = Σ_{i∈N} 2^{-i} f_i is continuous; moreover, 
since (U_i)_{i∈N} is a cover of X and f_i > 0 on U_i, we have F > 0 everywhere. 

For i ∈ N, we define the continuous function g_i: X → [0, 1] by 

$$g_i(x) = \max\Bigl(f_i(x) - \frac{1}{3}F(x),\, 0\Bigr).$$

We claim that (g_i)_{i∈N} is a locally finite family of functions such that supp(g_i) ⊂ 
U_i and Σ_{i∈N} g_i(x) > 0 everywhere. The argument to show supp(g_i) ⊂ U_i is almost 
identical to the one given in the case of a finite cover. 

Let us show that (g_i)_{i∈N} is a locally finite family of functions. Let x be fixed in 
X. Since F is continuous and positive everywhere, there exists a neighbourhood V 
of x and an ε > 0 such that F(y) > ε for every y ∈ V. Let i_0 be such that 2^{-i_0} < ε/3; 
from the definition of g_i and the fact that ‖f_i‖_0 ≤ 2^{-i}, it follows that g_i(y) = 0 for 
every y ∈ V and every i ≥ i_0. 

It remains to show that for every x one can find an i such that g_i(x) > 0. Fix an 
x ∈ X. Since f_i(x) > 0 for some i and ‖f_i‖_0 ≤ 2^{-i}, there must be some i_0 such 
that f_{i_0}(x) = sup_{i∈N} f_i(x) > 0. From the definition of F, we obtain F(x) ≤ 
(Σ_{i∈N} 2^{-i}) f_{i_0}(x) ≤ 2 f_{i_0}(x). It follows that g_{i_0}(x) ≥ f_{i_0}(x) - 2 f_{i_0}(x)/3 = f_{i_0}(x)/3 > 0. 

Finally, define the family (φ_i)_{i∈N} by 

$$\varphi_i = \frac{g_i}{\sum_{j \in \mathbb{N}} g_j}.$$


3. SMOOTH PARTITIONS OF UNITY FOR OPEN SETS OF SEPARABLE 
HILBERT SPACES. Existence of smooth partitions of unity for an open cover of a 
manifold has been established in various contexts. The finite dimensional case is 
well-known; for the infinite dimensional case see [La], [To]. We show that the 
method already given covers the separable case. We will do that for open sets of 
Hilbert spaces for the sake of simplicity. 

Let H be a separable Hilbert space (this covers also the finite dimensional 
spaces R^n). Since the square of the norm ‖·‖ is a quadratic form, the map 
x ↦ ‖x‖^2 is smooth (= C^∞) with first derivative bounded on every bounded set; the 
second derivative is constant and higher order derivatives are 0. Composing this 
map with an appropriate smooth map with compact support R → [0,1], we obtain 
the following lemma. 


Lemma 3.1. If x ∈ H and 0 < r < s, there exists a smooth function φ: H → [0,1] 
such that φ > 0 on B(x, r), φ = 0 outside B(x, s) and sup{‖D^k φ(x)‖ | x ∈ H} < ∞ 
for every k ∈ N. 


Before stating the second lemma, we recall that a Fréchet space is a complete 
Hausdorff topological vector space whose topology is defined by a countable family 
of semi-norms. 
Lemma 3.2. Let F be a Fréchet space and let (x_n)_{n∈N} be a sequence in F. There 
exists a sequence (ε_n)_{n∈N} of positive numbers such that for every sequence of scalars 
(λ_n)_{n∈N} with |λ_n| ≤ ε_n for all n ∈ N, the series Σ_{n∈N} λ_n x_n converges in F. 


Proof: Let (p_n)_{n∈N} be a sequence of semi-norms defining the topology of F. Any 
sequence (ε_n)_{n∈N} of positive numbers such that ε_n max_{0≤k≤n} p_k(x_n) ≤ 1/2^n will 
do. ∎ 


The space of smooth functions φ: H → R with uniformly bounded 
derivatives, i.e., such that sup{‖D^k φ(x)‖ | x ∈ H} < ∞ for every k ∈ N, is a Fréchet 
space in which p_k(φ) = sup{‖D^k φ(x)‖ | x ∈ H} is the family of semi-norms. 

Let U be an open set in H and let (U_i)_{i∈I} be an open cover of U. We wish to 
construct a smooth partition subordinated to that cover. Using the separability of 
H, we can find a countable family of balls (B(x_n, r_n))_{n∈N} such that the family 
(B(x_n, r_n/2))_{n∈N} covers U and for each n there exists an i ∈ I such that 
B(x_n, r_n) ⊂ U_i ⊂ U. Using Lemma 3.1, let f_n: H → [0,1] be a smooth function, 
with all derivatives uniformly bounded, such that f_n > 0 on B(x_n, r_n/2) and f_n = 0 
outside B(x_n, r_n). Multiplying f_n by 2^{-n}, if necessary, we can assume that 
‖f_n‖_0 ≤ 2^{-n}. 

By Lemma 3.2, we can find a sequence (λ_n)_{n∈N} of positive numbers such that 
Λ = Σ_{n∈N} λ_n < ∞ and F = Σ_{n∈N} λ_n f_n is a smooth function that is positive 
everywhere. 

Let θ: R → [0, ∞[ be a smooth function such that θ^{-1}(0) = ]-∞, 0]. If we set 

$$g_n(x) = \theta\Bigl(f_n(x) - \frac{1}{2\Lambda}F(x)\Bigr),$$

we can check that the family (φ_n)_{n∈N} defined by 

$$\varphi_n = \frac{g_n}{\sum_{k=0}^{\infty} g_k}$$

is a locally finite smooth partition of unity subordinated to the cover (B(x_n, r_n))_{n∈N}. 
We can now use the following well-known argument to find a locally finite 
partition of unity (ψ_i)_{i∈I} subordinated to the cover (U_i)_{i∈I}: For each n ∈ N, 
choose i_n ∈ I with B(x_n, r_n) ⊂ U_{i_n} and set ψ_i = Σ_{n : i_n = i} φ_n. We have thus 
obtained 


Theorem 3.3. Let U be an open set in a separable Hilbert space. For any countable 
open cover (U_i)_{i∈N} of U there is a partition of unity (φ_i)_{i∈N} subordinated to (U_i)_{i∈N} 
with φ_i smooth, i = 1, 2, .... 


Remark 3.4. If M is a σ-compact smooth manifold of finite dimension, the space 
of smooth functions on M is a Fréchet space for the C^∞ compact-open topology. 
We can then use arguments very similar to the ones given here to construct smooth 
partitions of unity on M. 


4. ANOTHER WELL-KNOWN CONSEQUENCE OF LEMMA 3.2. Lemma 3.2 
can be used in other instances to give easy proofs of useful facts. We illustrate this 
with the following proposition. 


Proposition. Let M be a σ-compact smooth manifold of finite dimension. Suppose U 
is an open set and X is a smooth vector field on U. There exists a smooth function φ: 
M → [0, ∞[ such that φ > 0 everywhere on U and (φ|U)X can be extended by 0 to a 
smooth vector field on the whole of M. 
Proof: The space C^∞(M, R) of smooth functions and the space 𝔛^∞(M) of smooth 
vector fields are both Fréchet spaces for the C^∞ compact-open topology. Let 
(φ_n)_{n∈N} be a smooth partition of unity with compact supports on U. Since 
supp(φ_n) is a compact subset of U, we can extend φ_n by 0 to a smooth function on 
M. In the same way we can extend by 0 the smooth vector field φ_n X defined on U to a 
smooth vector field X_n defined on M. We can apply Lemma 3.2 twice, to the 
sequence φ_n in the Fréchet space C^∞(M, R) and to the sequence X_n in the Fréchet 
space 𝔛^∞(M), to find a sequence λ_n of strictly positive numbers such that 
Σ_{n=0}^∞ λ_n φ_n converges in C^∞(M, R) and Σ_{n=0}^∞ λ_n X_n converges in 𝔛^∞(M). Since 
(φ_n)_{n∈N} is a partition of unity on U and λ_n > 0, the function φ = Σ_{n=0}^∞ λ_n φ_n is 
positive on U. Moreover, the vector field Σ_{n=0}^∞ λ_n X_n is smooth on the whole of 
M and, on U, this sum is Σ_{n=0}^∞ λ_n φ_n X, which is equal to φX. ∎ 
From fifty years ago in the Monthly... 


Results from the Seventh Annual William Lowell Putnam Mathematical 
Competition held May 24, 1947. 


The second prize, three hundred dollars, is awarded to the 
Department of Mathematics of Yale University, New Haven, 
Connecticut. The members of the team were Murray Gell-Mann, 
Murray Gerstenhaber, and Henry Otto Pollak; to each of these a 
prize of thirty dollars is awarded. 


p. 400, vol. 54, 1947 
Calculus: A Modern Perspective 


Jeff Knisley 


1. INTRODUCTION. An American history curriculum that ended with the Civil 
War would be no more acceptable than a philosophy curriculum that ended with 
Kant. Yet an acceptable history of mathematics curriculum gives little more than a 
cursory nod to the mathematics of the twentieth century. This is not to say that the 
two hundred years following Newton and Leibniz do not deserve seven chapters in 
a history of mathematics textbook, but rather that the one hundred years leading 
up to the present deserve more than one [1]. 

Unfortunately, our entire undergraduate curriculum has the same focus as our 
history of mathematics course—the two hundred years following Newton and 
Leibniz. Our graduates are more prepared for the period in which the steam 
engine replaced the horse than they are for the period in which compact disks 
replaced the vinyl LP. It is no wonder that a reform movement emerged early in 
this century, as evidenced by several articles in [2], nor is it surprising that the 
reform movement is stronger than ever today. Tragically, many mathematicians 
have responded to reform as in [3], where it was lamented that “Mathematics is 
losing its soul. Its priests are pawning it off to a different god.” Such a call to arms 
only reinforces the popular image of mathematicians as the last practitioners of 
some ancient art. 

We know better. We know that mathematics is growing and thriving, fruitful 
and strong. However, the new growth in mathematics is all but absent from our 
undergraduate curriculum. Indeed, our traditional calculus course is packed with 
intellectual deadwood—contrived applications, outdated examples, and obsolete 
definitions. It is time we allowed some new growth in a curriculum that has been 
antiquated for most of this century [7]. 


2. THE NEW GROWTH. Much of modern mathematics is derived from modern 
trends in calculus. Many of the ideas in differential geometry, statistics, and 
numerical analysis are descended from the study of calculus in the present century. 
Correspondingly, any new ideas introduced into the calculus curriculum should be 
cultivated from these modern trends. Technology allows us to incorporate linear 
regression, Markov processes, and probability distributions into an introductory 
calculus course, and when we do so, our students sense that they are studying ideas 
relevant to the world in which they live. 

Similarly, students appreciate applications that have a modern perspective, and 
such applications need not be outside of mathematics [4]. Applications of the 
derivative can be enhanced with the study of cubic splines and Bezier curves. The 
study of Newton’s method can be generalized to the study of fixed point theorems 
in general. The study of spectral theory begins with boundary value problems such 
as 

$$y'' = -a^2 y \tag{1}$$
$$y(0) = y(\pi) = 0 \tag{2}$$

and students genuinely enjoy being shown that (1)-(2) has nontrivial solutions only 
for integral values of a. 
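
A brief sketch of the computation behind this claim, with the constants A and B introduced here only for illustration:

```latex
For $a \neq 0$, the general solution of $y'' = -a^2 y$ is
\[
  y(x) = A\cos(ax) + B\sin(ax).
\]
The condition $y(0) = 0$ forces $A = 0$, and then $y(\pi) = 0$ gives
\[
  B\sin(a\pi) = 0 .
\]
A nontrivial solution needs $B \neq 0$, so $\sin(a\pi) = 0$, that is, $a \in \mathbb{Z}$.
For $a = 0$ the equation is $y'' = 0$, whose solutions $y = A + Bx$ vanish at
$0$ and $\pi$ only when $A = B = 0$. Hence nontrivial solutions
$y = B\sin(ax)$ exist exactly when $a$ is a nonzero integer.
```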

Finally, allowing such new growth can greatly simplify our efforts at instruction. 
Complex numbers make partial fractions and trigonometric substitutions much 
more accessible. A simple matrix exponential is a great illustration of the power 
series concept. And stating the laws of exponents as axioms for an abelian group 
shows a student that these are more than rules for manipulating superscripts. 
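
As one possible classroom illustration of the matrix exponential remark, the following Python sketch sums the power series I + A + A^2/2! + ... for a 2-by-2 matrix and compares the result with the known closed form for a rotation generator. The helper names and the choice of matrix are assumptions made here for the example, not something prescribed in the essay.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp_series(a, terms=30):
    """Partial sum I + A + A^2/2! + ... of the exponential power series."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    term = identity  # holds A^k / k! at step k
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


# exp of [[0, -t], [t, 0]] is the rotation matrix through angle t.
t = 1.0
generator = [[0.0, -t], [t, 0.0]]
approx = mat_exp_series(generator)
exact = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
print(approx)
print(exact)
```

The even powers of the generator reproduce the cosine series and the odd powers the sine series, so a student sees two familiar Taylor series emerge from a single matrix computation.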


3. THE DEAD WOOD. Allowing new growth into the calculus curriculum means 
something old must go, but such additions mean more than simple pruning. 
Indeed, patching the new into the old destroys the continuity and coherence of the 
original structure. Our present curriculum bears witness to this fact. The applica- 
tions of the integral contribute little to the remainder of the course, and the 
customary chapter on analytic geometry seems misplaced. Sequences are intro- 
duced in the context of convergence of series, and thus it should be no surprise 
that students get the two confused. 

Rather than further fragment our curriculum, we need to transform the calculus 
so that it is a coherent mix of timeless concepts and new ideas. I believe such a 
transformation must address the following: 


Approximation and limits. Our current calculus course relies on several seemingly 
unrelated notions of limit. The ε-δ definition is introduced en route to the 
definition of the tangent line. Infinite limits are defined using ε and N sufficiently 
large. Newton’s method is left to intuition. The definite integral is defined using 
the norm of a partition that goes to zero. Numerical integration introduces the 
idea of bounded error. Limits of sequences are defined with the monotone 
convergence theorem. And after spending section after section using limits of 
sequences to develop convergence tests, the idea of a converging Taylor series is 
developed from the remainder formula for Taylor polynomials. No attempt is ever 
made to connect all these ideas of limit into a coherent concept. 

We need to introduce and define the limit so that all our applications of 
approximation and convergence are derived from a single concept. This concept 
will likely have to be a principle rather than a definition. It may be more 
appropriate to explore approximation with a graphing calculator than with a 
formal system of definitions and theorems. 


Intuition and Rigor. We prove that the Mean Value Theorem is a consequence of 
the extreme value theorem, but we do not prove the extreme value theorem itself. 
Instead, we argue that the extreme value theorem is intuitively obvious. The reason 
for such an intuitive appeal is that the proof of the extreme value theorem depends 
on the Heine-Borel Theorem, and the Heine-Borel Theorem depends on the 
topology of the real line. Thus, the proof of the Mean Value Theorem is not a 
proof at all. We might as well argue that the Mean Value Theorem is intuitively 
obvious and skip its proof altogether. 

But we should not skip the Mean Value Theorem altogether. The point is that 
an introductory calculus course is not about theorems but is rather about defini- 
tions. If the difference operator 


Δf(x) = f(x + h) - f(x) 


could be used easily, then we would not even need the concept of a limit. However, 
our computationally attractive chain and product rules come solely from the 
definition 

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$


Good theorems are the stuff of graduate courses. Good definitions are the stuff of 
introductory calculus. 

In my mind, this all but eliminates the traditional Riemann definition of the 
integral. It requires too much time and machinery—Riemann sums, the norm of a 
partition, the arbitrary choice of a point in a subinterval—and it is machinery that 
will not be used again. In contrast, Lebesgue's definition of the integral is 
intuitively simple and can be stated without a lot of machinery. The integral of a 
simple function is a picture. It can be computed using a table. The integral is a 
measure of simple function approximation. Simple functions can be used later in 
developing the integral test for convergence of a positive term series. 


Technology and Modern Science. The computer was developed by mathematicians 
like Von Neumann for mathematicians like us, so it is absurd that mathematicians 
would not enthusiastically embrace their own creation. Technology should be a 
tool we welcome with excitement—no more rigging problems so the algebra comes 
out right. If a problem requires the solution to 


6x° + 3x7+1=0, 
then the student can pull out the trusty graphing calculator, estimate the roots 


graphically and use a root finder to polish off the answer to the desired number of 
decimals. We should forget extremum problems with functions like 


f(x) =x +x? +x, 
and instead ask students to find the extrema of functions like 


x60? + 347 +1 


f(x) = i Arnel 


dt. 


As a result, calculus can be presented to the student as the true foundation of 
modern mathematics and science instead of as a hodge-podge of problems re- 
stricted to angle multiples of 30° and 45°. Indeed, modern science is more than the 
study of classical mathematics. It runs parallel to modern mathematics and often 
intertwines with it. Technology means that regression and curve-fitting can be 
studied in a calculus course and then immediately applied in a chemistry course. 
Our debate should be about how we should use technology, not if we should use 
technology. 


4. CONCLUSION. The present dilemma is both fortunate and obvious. Calculus 
is modern, but our calculus curriculum is not. Calculus will continue to prosper 
with time and technology. We can take comfort in knowing that mathematics will 
continue to enlighten and enliven the minds of generations to come. 

But poetry does not have to be taught by poets, and likewise, mathematics need 
not be taught by mathematicians. Our traditional course barely even tests the 
abilities of software tools such as Maple and Mathematica, and the days when 
every student carries a laptop are not that far off. Already our first semester 
calculus course is little more than a supplement to the high school curriculum. 
Without a modern perspective and an enthusiasm for the technology we 
mathematicians created for ourselves, our relevance to society will continue to dwindle. 
It will be only a matter of time before we mathematical horsemen are replaced by 
the intellectual equivalent of the steam engine. 
Book review from the Monthly fifty years ago... 


College Algebra. By A. A. Albert, New York and London, McGraw-Hill 
Book Co., 1946. 12 + 278 pages. $2.75 


In his introduction the author justifies his new textbook in college 
algebra and one can do no better than to let him present his own 
case: "College algebra has been a most abused subject. The time 
allotted to it is frequently inadequate for a genuinely good 
treatment, and indeed the entire course is sometimes omitted. This 
is due partly to a desire to bring students to a study of the calculus 
as early as possible. It is also due partly to the presentation of 
college algebra, in all texts thus far published, as a collection of 
seemingly unrelated topics. The desire to teach the calculus as early 
as possible tends to defeat its own ends. The building of a course in 
the calculus on what must be a weak foundation cannot result in a 
good student understanding of the subject. There is also no reason 
why the material of college algebra cannot be cohesively organized." 


p. 174, vol. 54, 1947 
Pro Choice 


Arnold Ostebee and Paul Zorn 


INTRODUCTION. This essay is “pro reform.” But before arguing for “reform,” a 
few important words about the word itself. First, we acknowledge some discomfort 
with the value-laden term “reform”; such judgments are better left to the reader. 
But the word is now in common use, so we'll adopt it and its variants without 
further apology, and usually without quotation marks. Second, we observe that 
there is no such thing as the reformed approach to calculus. We self-styled 
reformers do indeed tend to subscribe to some common broad goals and tenets, 
such as the primacy of conceptual understanding, the pedagogical value of multiple 
representations, and the potential of technology to improve both pedagogy and 
content. On the other hand, there is no party line—let alone unanimity—among 
reformers on much else. Even a cursory comparison of several reform texts shows 
that there is no single reformed position on convergence and divergence of 
numerical series, no reform-approved list of integration techniques, no official ban 
on lecturing, no accepted dogma on implicit differentiation, and no enforced taboo 
on the mean value theorem. 

Individual reformers and texts do, of course, take positions on most of these 
questions: a text either covers trigonometric substitutions or it doesn’t. That 
different reformers make different choices—indeed, reformed approaches offer 
much more diversity of choice than do traditional texts—is no valid criticism of the 
reform idea in general. On the contrary, the willingness of reformers to make hard 
choices is an important virtue of reformed approaches to calculus, and the one we 
praise in this essay. 


Facing choice. Teaching calculus, “reformed” or not, will always be a matter of 
choices: what to cover, what to prove, what to test, what to assign, what to say, 
what to imply, what to skip, which tools to allow, which to forbid. We may not 
consciously acknowledge or even recognize these choices, but they are inescapable, 
and they have consequences. If we study hypergeometric series deeply, we may not 
get to numerical integration—and vice versa. Is Euler’s method a fair trade for 
partial fractions? Is a proof of the chain rule worth the same amount of time spent 
understanding what it says? Is the course itself an introduction to mathematical 
analysis, or is it an introduction to calculus-based tools and their uses? 

The key point is simple: the inevitable choices we face—as “reformers” or as 
“traditionalists”—are very often among competing goods. (In the heat of mathematico-political 
debate, combatants often caricature the choices as between good 
and evil.) Related rates problems, for instance, figure on many reformers’ hit lists, 
and for good reason: too often, the problems are contrived variations on a few 
rigid themes. But there’s another side to the story. Understanding derivatives as 
rates is a standard tenet of calculus reform, so the idea of related rates—relating 
varying quantities to the rates at which they vary—can be seen as a natural and 
productive extension. In other words, the topic of related rates is intrinsically 
neither good nor bad; what matters is what we choose to make of the topic, and 
why. 
The need to make hard pedagogical choices in calculus has nothing special to do 
with a reformed approach to the subject. A willingness to make hard choices, 
however, does seem to us to be a special virtue of reformed approaches. As 
textbook authors we know well (and may sometimes even have succumbed to) the 
temptation to choose some of everything. With world enough and time, that 
strategy might work, but the real world of calculus is too large, and the time 
available too limited, to permit that luxury. 


How much rigor? As examples of hard choices, consider some questions of rigor: 
Should important theorems and definitions be stated in full generality? Should 
they be proved in the same spirit? 

A mathematician’s natural and (usually) healthy impulse is, when in doubt, to 
prove everything, as generally as possible. Granted, no modern elementary calculus 
text follows this policy consistently, but its indirect effects are clear: most tradi- 
tional texts contain too many proofs. “Reformed” texts tend to be more selective. 
Some of these proofs are quite subtle, although fine points may be finessed with 
judicious disclaimers. Even if the proofs elude most students (goes one argument), 
some of the best will understand, and the rest will at least have “seen” the proofs. 
These treasured seeds, though little watered now, may someday sprout. 

This “broadcast” strategy fails on both mathematical and pedagogical grounds. 
Careful proofs, even of the “simplest” theorems in calculus, often depend on quite 
subtle ideas from analysis, such as the persistence of equalities and inequalities 
upon taking limits. In addition, the lesson many students draw from half-under- 
stood, force-fed rigor is damaging and discouraging: We aren’t even supposed to 
understand this stuff, and more like it is probably on the way. 

Many reformers (ourselves included) choose simply to skip most formal proofs 
in elementary calculus. To put it more politely, we “defer” such proofs until later 
courses in analysis. This policy has, at the very least, the virtue of 
forthrightness—there is no pretense of presenting a logically airtight development 
of the subject. Just as important is the pedagogical calculation involved. The time 
and energy even “easy” proofs take is often much better spent helping students 
understand what a few important theorems really say: the difference between 
hypotheses and conclusions, what goes wrong without a key hypothesis, whether or 
not the converse holds, how the theorem answers a “natural” question, etc. The 
goal of understanding theorems, by the way, is much too often taken for granted. If 
you doubt it, try asking Calculus II students to state—not prove—any important 
theorem from Calculus I. Then stand back and watch the syntax fly. 

The concrete approach to rigor we take in our own calculus textbook [1] is to 
omit most proofs, but to present a chosen few (very few) in some detail, as a short 
but moderately serious excursion into new mathematical terrain. We try to explain, 
for instance, that while the implication 

$f' = 0 \implies f = \text{constant}$ 

may seem obvious, it really requires proof, and the proof is surprisingly subtle. It 
depends on a version of a mean value principle; we prove one in some detail. The 
result we aim for, in the short run, is not primarily to settle students’ “live” 
questions of logical validity, but rather to introduce students, briefly but honestly, 
to an important area of mathematical culture. Later, we hope, many students will 
revisit the area. 
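
The core of such an argument can be sketched in one line, assuming the standard Mean Value Theorem (the version of the mean value principle proved in [1] may differ). If f is differentiable on an interval I with f' = 0 throughout, then for any a < b in I:

```latex
f(b) - f(a) = f'(c)\,(b - a) = 0 \quad\text{for some } c \in (a, b),
\qquad\text{hence } f(a) = f(b).
```

Since a and b were arbitrary, f is constant on I. The subtlety sits inside the Mean Value Theorem itself, whose usual proof rests on the completeness of the real numbers.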


The hardest choices. As textbook authors we and other reformers have had to face 
many of the hard choices implicit in designing calculus courses; quite properly, 
we've faced them in different ways. Authors don’t make every choice, of course. 
Some are best made by individual teachers; still other choices, we admit, can be 
finessed or evaded by dint of appendices, “optional” sections, and other devices of 
creative publication. Hardest and most important are the highest-level choices, 
such as those related to a course’s general goals, its target audience, and the 
“toolkit” it employs. These choices reverberate everywhere throughout a course. 

Consider, for instance, the molten-button issue of building skill and speed with 
paper-and-pencil symbol manipulation, and its countless corollary questions: What 
traditional manipulation techniques are really important? Are some techniques 
obsolete in the age of the TI-92 and its inevitable successors? How should by-hand 
and mental calculation be balanced? Is symbolic manipulation facility (by hand or 
by head) essential for conceptual understanding? Do we learn by (physical) doing? 
Must we learn this way? 

Reasonable people, reformers included, can...and do...and should... differ, 
significantly, on all of these questions. We say “reasonable” with full sincerity: To 
downplay the algebraic viewpoint, even radically, in the limited agenda of an 
elementary calculus course is not to deny its importance in the larger mathematical 
sphere. By the same token, there is nothing inherently “mindless” about symbolic 
drill, any more than with any other form of mental or physical exercise. The key, at 
either extreme or anywhere in between, is a clear and balanced view of a course’s 
larger goals and intended audience. 


Conclusion. Calculus reform is not a single alternative to a standard diet; reform 
offers a diverse menu of different but carefully-considered choices. The greatest 
lasting value of calculus reform may well be in highlighting and forcing important 
choices among competing goods—choices that have always been present but too 
little acknowledged. In the end, choosing everything amounts to choosing nothing. 
News Items from the MONTHLY twenty-five years ago... 


Associate Professor G. L. Alexanderson, University of Santa Clara, 
has been promoted to Professor. p. 818 

In spite of this, the Association will not increase dues for 1973. 
They will remain at $12.50 for individual members. p. 580 

... Vol. 79, 1972 
Rethinking Calculus: Learning and Thinking 


James J. Kaput 


1. REMODELING CALCULUS THE INSTITUTION. Surely the renewal of Cal- 
culus is a good idea, one good enough to attract the attention and energy of many 
good people. But this is Calculus the Institution—that peculiarly American aca- 
demic event and all its supporting structures and expectations. Professor Knisely, 
however, barely hints at matters of institutional implementation, so I conclude that 
he is addressing Calculus, the System of Knowledge and Technique. As such, his 
paper is, perhaps, a warm-up exercise to a deep and long overdue reconsideration 
of the appropriate intellectual content of Calculus, one that has been postponed 
while we attempt to remodel Calculus the Institution. 

This remodeling has proven to be an arduous task for two reasons: (1) the 
renovation is taking place whilst the owners and stakeholders continue to inhabit 
the institution (a constraint applying to most educational reform); and relatedly (2) 
we have left all the larger structural features of the institution intact, including 
those features that connect it to the outside world, e.g., to the rapidly changing 
K-12 education. The basic architecture and its place in the larger world are untouched. 
I suggest that we embark on the more fundamental rebuilding towards which 
Knisely points. In so doing we need to come to terms with the relations, existing 
and possible, between Calculus the Institution (C-INST) and Calculus the System 
of Knowledge and Technique (C-KNOWL). And we need to look more deeply and 
critically at the assumptions, largely tacit, that hold the status quo in place and 
provide some concrete, implementable alternatives. 


2. RELATIONS BETWEEN C-KNOWL AND C-INST. The key relation of interest 
to me involves learning and cognition. How can ideas and techniques become 
knowable and usable by those who need to know and use them? And who needs to 
know them, and in what ways do they need to know them? But before we can get 
to these questions, we must review the other key relation between C-KNOWL and 
C-INST, the historical one. 

C-INST is the product of several centuries of evolution. The curriculum and 
texts are rooted in C-KNOWL, which developed at the hands of masters in the 
17th and 18th centuries. Many basic curricular structures set down in textbooks by 
L’Hôpital, the Bernoullis, Euler, and their contemporaries, have remained largely 
invariant through the 20th century—for a very good reason: they served traditional 
purposes and populations extremely well. Indeed, this presentation of C-KNOWL 
is at the foundation of our civilization’s scientific and technological infrastructure. 
While C-KNOWL evolved into an almost sacred academic tradition [7], the 
ambient societies, the nature of education, and the relations between education 
and the larger society, including and especially in the United States, changed and 
continue to change profoundly. It is worth noting that, according to Department of 
Education statistics [13], the percentage of students taking AP Calculus today is 
equal to the percentage graduating from high school a century ago! And, as 
recently as the 1950’s, immediately prior to the huge increases in US access to 
higher education, calculus was commonly preceded by a preparatory course even at 
elite universities. Our expectations regarding who can learn what surely change 
with the times. The C-INST we know today, while connected to a venerable 
C-KNOWL, is a relatively recent artifact. Its increasing dysfunction and ill-fit with 
the new circumstances, especially technological ones, are what gave rise to the 
Calculus reform movement—the remodeling of C-INST. 


3. CALCULUS REFORM FOR THE OTHER 90%. The fixing of C-INST serves 
only 10% or so of our population, the socio-economic and intellectual elite. The 
population at large—4 million in each age-level cohort—continues to be denied 
access to the key ideas of C-KNOWL, a quietly accepted national oversight, 
despite the fact that we collectively spend billions of dollars supporting a curricu- 
lum largely aimed at calculus [9]. At the same time we congratulate ourselves on 
increasing percentages of students passing AP Calculus—to perhaps 3–4% of a 
given grade-level cohort [13]. Not only do we ignore 90% of the population’s 
calculus learning, the 10% on whom we do focus actually need to learn much more 
mathematics of change and variation than C-INST currently allows them to 
encounter, for example, the mathematics of dynamical systems: the ideas, repre- 
sentations, and skills needed to make sense of nonlinear phenomena—the mathe- 
matics that flourishes in the computational medium [1], [11]. And perhaps even 
some of the ideas Knisely suggests. 

This is the background against which I wish to discuss learning and thinking of 
the mathematics of change and variation, including calculus. It is not enough to 
toss around names of ideas, procedures, and relationships that exist in the formal 
cultural record of mathematical achievement and in some form or other in the 
minds of that super-elite constituting professional mathematicians. As the late 
Morris Kline reminded us, the post-hoc logical structures of these mathematical 
products may have little to do with the structure of the experiences that students 
need in order to build viable versions of them in their minds [5]. The kind of idle 
speculation or assertion about the former, without regard for the latter, that 
appears in Knisely’s article represents a general and entirely understandable 
tendency in our community to think and plan in terms of the cultural artifacts and 
language that we inhabit. The alert reader will have noticed that I indulged in a bit 
of this in the previous paragraph when mentioning dynamical systems. As episte- 
mological Flatlanders, we don’t distinguish between “up” and “North.” But at 
some point, preferably earlier rather than later, our analyses must turn to learning 
and thinking, to the conceptual, cultural, and experiential roots of our mathemat- 
ics—we must break our mind-forged manacles and look up. 

Each word or phrase that we use to denote some mathematics is but a pointer 
to a thick web of ideas and relationships, treasures hard-won by great mathemati- 
cal minds and requiring even greater struggles to be understood by more ordinary 
minds. I suggest that when we take learning and thinking seriously, we quickly dig 
to the foundations of our discipline. Simultaneously, we begin the rethinking that 
reaches beyond remodeling C-INST to build the extended means by which the 
neglected and under-served 90% may come to know some of the classical C- 
KNOWL, by which our favored (and important) 10% may come to know it more 
deeply, and by which both groups may come to learn an even broader and richer 
mathematics of change and variation. Such rethinking requires an openness to new 
organizations of ideas, new notations and ways of acting on notations, new uses of 
interactive technologies that reach beyond the CAS’s designed to facilitate or 
supplant traditional adult competencies with formal symbols, new time scales for 
learning big ideas (years instead of months), and an enriched conception of what 
counts as legitimate mathematical thinking. Our current work is beginning to shed 
light on the transformative power of such ideas as change and variation to 
contextualize and organize many of the ideas and skills in K-12 mathematics 
already regarded as important, and to reveal the efficiencies in curricular organiza- 
tion that are required in order to make room for the new mathematics needed by 
students living their lives into the second half of the 21st century. 


4. RETHINKING CALCULUS—AN ILLUSTRATION. To illustrate I sketch 
briefly some approaches developed in the ongoing SimCalc Project, with no 
pretense that it is complete or definitive. We began with a combination of 
historical analysis that examined attempts by the Scholastics to mathematize 
change before algebra was available [4], the large literature on students’ difficulties 
with kinematics [8] and graphs [6], and a view that new technologies could be used 
to support learning that is more foundational than learning facility with traditional 
notations. Rather, we wished to build the ideas to which these notations conceptu- 
ally refer, the ideas that they are “about.” We asked how the key underlying ideas 
of rate of change, accumulation, the connections between variable rates and 
accumulation, and approximation all might be made sensible to as young and 
diverse a population as possible. This initially meant the middle school grades, but 
has more recently involved the elementary grades. Following the historical lead 
and recognizing that the language and metaphors of motion are used quite 
generally to describe change and variation [2], we focused (although not exclu- 
sively) on mathematizing linear motion, particularly by controlling motion simula- 
tions in familiar or fanciful situations: elevators, people walking or dancing, cars, 
duckies on a pond, boats in a river, space-vehicles, and so on [10]. 

Our starting criteria were to begin with students’ intuitive experience with 
velocity, to minimize computational complexity, and yet to maintain sufficient 
variation to avoid the conceptual degeneracy of constant velocity [12]. These 
criteria led to extensive use of piecewise constant velocity functions. Furthermore, 
we wanted to support direct graphical manipulation of these velocity 
functions—after all, defining and manipulating piecewise constant functions alge- 
braically is a very cumbersome process. 

These considerations lie behind the situation depicted in Figure A, where the 
graphs on the right side of the figure drive what we usually call “jerky elevators” on 
the left. Here the student is dragging vertically at the arrow a constant velocity 
segment (currently at height one floor/sec) in order to make the elevator named 
“Right” get to the same floor at the same time as elevator “Left,” which has a 
discontinuously decreasing velocity (step) function. In this particular case, if 
“snap-to-grid” were turned on, forcing all values of time and velocity to be 
integers, the student could not succeed (9 floors in 7 seconds). Further, the Mean 
Value Theorem’s continuity hypothesis is violated, of course, and its conclusion 
fails. If a linear velocity function had been used instead of the staircase, then the 
student would be building an instantiation of Merton’s Theorem [3, p. 86]. 
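
The arithmetic behind this situation can be sketched in a few lines of Python. The staircase below is hypothetical (the actual step values in Figure A are not reproduced here); it is chosen only so that elevator Left covers 9 floors in 7 seconds, as in the text. The function and variable names are likewise illustrative.

```python
from fractions import Fraction

# Hypothetical decreasing staircase for elevator "Left":
# (velocity in floors/sec, duration in sec), chosen only to total 9 floors in 7 s.
LEFT_STEPS = [(Fraction(2), Fraction(2)), (Fraction(1), Fraction(5))]


def total_distance(steps):
    """Accumulated distance: the area under a piecewise constant velocity graph."""
    return sum(v * dt for v, dt in steps)


def total_time(steps):
    """Total duration of the motion."""
    return sum(dt for _, dt in steps)


def mean_velocity(steps):
    """The constant velocity that covers the same distance in the same time."""
    return total_distance(steps) / total_time(steps)


mean = mean_velocity(LEFT_STEPS)

# With snap-to-grid on, only integer velocities are available to the student.
on_grid = mean.denominator == 1

# Does the staircase ever actually take its mean value?
attained = any(v == mean for v, _ in LEFT_STEPS)

print(mean, on_grid, attained)
```

For this staircase the required constant velocity is 9/7 floors/sec, which is not an integer and is never taken by the step function itself, so the conclusion of the Mean Value Theorem fails exactly as described.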

Although it is difficult to build a case on such a narrow shard of curricular 
activity, this mathematically mundane problem-situation illustrates several aspects 
of rethinking subject matter as it is experienced and learned by students. First of 
all, three key underlying ideas—constant rate, mean-value, and area under a 
rate-graph—are directly and enactively addressed at a level sensible for upper 
elementary age students. Second, these ideas are approached graphically rather 
than algebraically, with a tight referential relationship to motion phenomena. 
Velocity, position, and acceleration graphs in this approach are not only linkable to 
each other (or to tables or equations), they also provide three different descriptions 
of readily viewable and controllable phenomena. That is to say, unlike much school 
activity based on the “Big Three” representations (numeric, graphic, and algebraic), 
they represent something other than each other! Their primary referential relation is 
to the phenomena. Third, accumulation, via simple arithmetic sums, is addressed 
before the subtle ideas of rate and slope. 

Figure A 

We could also approximate the staircase by dragging a linear function into place 
—another interesting reversal of the usual direction, which is to approximate 
continuously varying functions by discretely varying ones. In our case the approxi- 
mation-error can be directly computed and predicted by 5th graders, and can be 
tested by observing the final (or intermediate) positions of the elevators. 


5. AN EARLY START: BUILDING ON KINESTHETIC EXPERIENCE. Young 
children (grades 2-5) first meet mean values in physical activities. Students have 
pre-quantified notions of their own “slow,” “medium,” and “fast” speeds that they 
enact on a marked section of classroom floor. A pair of students is to move along 
the marked line such that one (A) moves for 2 seconds at “slow” speed, then 2 
seconds at “fast,” and then 2 seconds again at “slow;” the other student (B) is to 
move at a constant “medium” such that she reaches the endpoint at exactly the 
same time as A. Perhaps after a practice run where B tries, at roughly constant 
speed, to cover the distance that A traveled in 6 seconds, they try to perform this 
task in parallel. Great struggles ensue as B tries not to be influenced by the fact 
that at first she is ahead of A and then is behind A. The observing students shout 
“Constant speed!” “No changing!” and so on, while B fights her own kinesthetic 
sense to slow down, to catch up, etc. She is learning, at a very fundamental 
physical and personal level, a version of constant rate, and perhaps less directly, a 
sense of average speed, that serves as a foundation for understanding constant 
functions and linearly increasing distance in their various more formal representa- 
tions to come, including those encountered in simulations such as in Figure A. 
Indeed, children’s physical experience is difficult for them to quantify but is 
kinesthetically rich, whereas simulations can be made quantitatively rich, although 
they are kinesthetically vacuous. Hence we engage students in parallel activities on the 
computer, where they begin with piecewise constant velocity functions labeled only 
as “slow,” “medium,” and “fast” (one, two, and four floors/sec, respectively), and 
gradually move to more quantitative problems and methods, first graphical and 
numeric and, eventually, algebraic. 
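
A short check shows why “medium” is the right constant speed for student B in the floor activity. It assumes that the activity is read with the same values the computer activities attach to these labels (slow = 1, medium = 2, fast = 4 floors/sec); the children's own speeds are not quantified. Writing d_A(t) and d_B(t) for the distances covered after t seconds:

```latex
\begin{aligned}
d_A(6) &= 2\cdot 1 + 2\cdot 4 + 2\cdot 1 = 12, &\qquad \bar v_A &= 12/6 = 2 = v_B,\\
d_A(2) &= 2 < 4 = d_B(2), &\qquad d_A(4) &= 10 > 8 = d_B(4).
\end{aligned}
```

Under these values B is ahead after two seconds, behind after four, and level with A at six, which matches the struggle described for the classroom version.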

Another connection with physical motion is available through the use of motion 
sensors, which in microcomputer based labs have been used to import and then 
graph quantitative aspects of phenomena, approximate the data via curve-fitting 
techniques, etc. In SimCalc MathWorlds [10], it is possible to attach the motion-data 
to an object and replay it, or edit the motion, or, as was done in Figure B, create a 
series of motions that relate in some interesting way to the original motion. Here 
the Frog-character, with the dashed position graph representing a student’s actual 
imported motion, is leading a “Clown Parade” where the clowns were given 
position functions synthetically. Note the “class-clown” outlier who is marching to 
her own drummer. (Software graphics are in color, allowing color-coded graphs 
and actors.) The next generation of this software will support uploading of 
functions from hand-held devices, so each student in the class can control a 
character in the simulation—they can be in the parade! 


Figure B 
6. ONGOING INVESTIGATIONS: FORMALIZATION. An important question is 
how to get from the informal understandings discussed in preceding sections to a 
more formal calculus that supports the symbolic technique that some students 
need. We closely study student learning and thinking over extended periods as the 
students solve problems and work in computer environments of the types described 
earlier. Their work both lays the base for, and then actually builds, the many ideas 
at the heart of C-KNOWL. As such ideas are being solidly established, their 
formalization, which is the source of enormous power for those who have mastered 
it but a great difficulty for many students at all levels, becomes a much more 
tractable matter. Two formalization strategies are available. One involves begin- 
ning with an extended, graphical pre-algebraic mathematics of change experience 
in the earlier grades and then superimposing algebraic notation upon what is 
already understood graphically, representing the graphs and phenomena via the 
usual classes of functions. The second involves co-learning algebra and the ideas 
that it can represent and manipulate, including rule-based descriptions of motion 
and change. The second strategy, which perforce must begin in the early grades, 
introduces and reveals the computational power of the formalisms early and often 
—as was the case historically. Both strategies provide an experiential anchor for 
some of the basic functions, their derivatives and integrals. And both offer a major 
departure from remodeling C-INST for the privileged 10% to building new 
curricular structures to include ALL students. 

The combination of factors that led to Calculus reform applies across the 
mathematics curriculum, as does the inadequacy of top-down, university-centric 
reforms. Just as previously successful rote-based learning and coping strategies 
prove inadequate for students as the mathematical challenges become more 
substantial, inherited curricular strategies and technologies prove inadequate in 
the face of our historic challenge to teach much more mathematics to many more 
people than ever before. And, to go a step farther, our reform strategies may prove 
inadequate as well. We now must reach beyond the confines of the university, and 
draw upon deep and detailed understandings of mathematical learning and think- 
ing across many age levels and types of students. This is neither as easy nor as 
convenient as speculation about neat new topics, but without more foundational 
work our, and our students’, real alternatives will remain limited. 
Problems from the MONTHLY fifty years ago... 


4235. Proposed by Irving Kaplansky, University of Chicago, and D. C. Lewis, 
University of New Hampshire 
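Show that the determinant 

$$
\begin{vmatrix}
(x-1)/1 & (x^2-1)/2 & \cdots & (x^n-1)/n \\
(x^2-1)/2 & (x^3-1)/3 & \cdots & (x^{n+1}-1)/(n+1) \\
\vdots & \vdots & & \vdots \\
(x^n-1)/n & (x^{n+1}-1)/(n+1) & \cdots & (x^{2n-1}-1)/(2n-1)
\end{vmatrix}
$$

is a constant times $(x-1)^{n^2}$. 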


4176 [1945, 522]. Proposed by H. S. M. Coxeter, University of Toronto 

Prove the following two theorems in affine geometry of three dimen- 
sions: 

(a) If all the faces of a convex polyhedron are parallelograms, their 
number is the product of two consecutive integers. 

(b) If each face of a convex polyhedron has a center of symmetry, the 
whole polyhedron has a center of symmetry. 


E735 [1946, 394]. Proposed by Paul Erdős, Stanford University 

Six points can be arranged in the plane so that all triangles formed by 
triples of these points are isosceles. Show that seven points in the plane 
cannot be so arranged. What is the least number of points in space which 
cannot be so arranged? 


4252. Proposed by Paul Erdős, Syracuse University 
It is well known that 2n!/n!(n + 1)! is always an integer. Prove that for 
every k there are infinitely many n’s such that 2n!/n!(n + k)! is an 
integer. 


E768. Proposed by Irving Kaplansky, University of Chicago 
A number n has the property that for any p < q < n, 

$S = p + (p + 1) + \cdots + q$ 

is never divisible by n. Show that this is true if and only if n is a power 
of 2. 


... Vol. 54, 1947 
What Do We Do About Calculus? 
First, Do No Harm 


Richard Askey 


In memory of Chih-Han Sah 


In the spring of 1994, the Dean of our Engineering School paid the expenses of 
four speakers to tell the Mathematics Department how to teach calculus in a 
modern way. To him, a modern way was intensive use of computers. The real goal 
was to have us teach the same amount of calculus but with fewer credits, so that 
more of the time of engineering students could be spent taking engineering 
courses. A joint committee was set up to look at what has been taught and what 
needed to be taught. The conclusion was that with heavy use of computers it would 
take more time to teach the same material rather than less, so nothing came of the 
push to cut the number of credits. 

I asked the first speaker about proofs. He replied that calculus was not the 
place to do proofs. Proofs should start in the junior year of college, primarily for 
students who are mathematics majors. 

There was an education meeting held at the University of Chicago. A Japanese 
education official said, “About half of the ninth-graders could express quantitative 
relations using letters (variables) and could write geometrical proofs” [8]. I asked if 
ninth grade Japanese students could learn how to do proofs, why couldn’t our 
calculus students also do this? While we have a higher percentage of students 
taking calculus now than we did forty five years ago when I was a young college 
student, we do not begin to have 50% of the age cohort taking calculus. Saying that 
we have so many more students taking calculus that we cannot possibly expect 
them to be able to do this is looking at the wrong comparison group. In addition to 
proofs in geometry, there are other proofs in Japanese middle school books, such 
as a proof that the square root of 2 is irrational. See [7]. 

The second speaker talked about differential equations, and began with this 
equation: 

$$x' = x^2 - t$$

with an initial condition. Once this was put up on an overhead, I worked out the 
solution. The speaker said this was a differential equation that could not be solved 
[exactly], and I let this go by without saying anything. After talking on other topics, 
the speaker came back to this, her favorite equation, and put up an overhead, 
which seemed to show a pole. She said that a pole was there and that a colleague 
had shown this. This equation is just a Riccati equation, so can be linearized, and 
the linear equation solved. In the present case, the linear equation is the Airy 
equation, so a solution is easy to find. Riccati equations are important in control 
theory, and there were electrical engineers in the room, so I did not want them to 
think that the mathematicians did not know what was happening. I asked if they 
then solved the equation to explain where the pole comes from. She repeated what 
she had said earlier about a colleague having proven that the equation could not 
be solved. I said that it depends on what is meant by solved. In the present case, 
the solution is easily found in terms of Airy functions, which are Bessel functions 
in a slight disguise. The pole that was claimed to exist comes from the smallest 
zero in the denominator. 
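
The linearization can be written out briefly for the equation x' = x^2 - t. The substitution used is the standard one for Riccati equations; Ai and Bi denote the Airy functions, and the constants c_1, c_2 would be fixed by the initial condition, which is not reproduced here.

```latex
\begin{aligned}
x &= -\frac{u'}{u}
  &&\Longrightarrow\quad
  x' = -\frac{u''}{u} + \Bigl(\frac{u'}{u}\Bigr)^{2} = x^{2} - \frac{u''}{u},\\
x' &= x^{2} - t
  &&\Longrightarrow\quad
  u'' = t\,u \quad\text{(the Airy equation)},\\
u &= c_1\,\mathrm{Ai}(t) + c_2\,\mathrm{Bi}(t)
  &&\Longrightarrow\quad
  x(t) = -\frac{c_1\,\mathrm{Ai}'(t) + c_2\,\mathrm{Bi}'(t)}{c_1\,\mathrm{Ai}(t) + c_2\,\mathrm{Bi}(t)}.
\end{aligned}
```

The poles of x are the zeros of the denominator u, so the first pole met along the solution is at the nearest zero of that combination of Airy functions.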

There is a tendency to downplay the role of infinite series in calculus and in 
differential equations. The usual argument for differential equations is that it is 
hard or impossible to see the long range behavior from a power series. In the 
present case, it is the smallest zero that is in question. The other argument given 
against power series solutions of differential equations is that few differential 
equations have solutions that can be written in the form of a nice series. It is one 
of those miracles of nature that many very important problems lead to just those 
differential equations that can be solved in series with nice coefficients. These 
series have the property that the term ratio of the coefficients is a rational function 
of n, and are called hypergeometric series. A course in calculus is not the place to 
study hypergeometric series in detail, but the most important one, the binomial 
theorem, should be there. The ratio test for convergence is as popular with 
students as it is because it is easy to compute the limit of a term ratio that is a 
rational function of n, and many of the elementary functions studied in calculus 
have power series of this type. Students should start to be led in the direction of 
seeing that this class of functions is important. 

The third speaker was someone I have known for years, so I asked some 
questions in an e-mail before he arrived. One was about differentiating $x^n$. This
can be done in several different ways. The traditional one in our texts was to quote 
the binomial theorem to get started. This used to be a standard topic in algebra. 
One of the new calculus books does it this way, and refers the reader to any high 
school algebra book for a proof of the finite binomial theorem. I called and asked 
one of the authors if he had looked at any high school algebra or precalculus books 
recently. He said no. He should, for the binomial theorem is no longer the staple it 
once was. For example, the precalculus book written by the faculty of The North 
Carolina School of Science and Mathematics [1] does not have either the binomial 
theorem or the geometric series. In response to my question to readers of the 
e-mail discussion group calc-reform, someone replied that most of his students had 
taken calculus in high school. If there is anything students remember from high 
school, it is the formula for the derivative of $x^n$, so he does not give a derivation.
Many people who are supporting the current reform efforts do not like formulas,
and so do not want to use the binomial theorem to differentiate $x^n$. One solution
to this problem is to use another formula. Instead of writing

$$\frac{(x+h)^n - x^n}{h}$$

it is possible to write

$$\frac{y^n - x^n}{y - x}$$

or even

$$\frac{(qx)^n - x^n}{qx - x} = x^{n-1}\,\frac{q^n - 1}{q - 1}$$

and use the sum of a finite geometric series. This argument also works when n is
rational. In the course of changing variables to see this, you give an introduction to 
the chain rule and to the simple form of l’Hospital’s rule. 
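
A small numerical sketch of the geometric-series form of the difference quotient, restricted to positive integer n (the rational case mentioned above is not covered here). The function name is hypothetical. Because the quotient $(q^n - 1)/(q - 1)$ is replaced by the finite sum $1 + q + \cdots + q^{n-1}$, the expression can be evaluated at q = 1 itself, where it gives $nx^{n-1}$.

```python
def difference_quotient_via_q(x, q, n):
    """((qx)^n - x^n) / (qx - x), rewritten as x^(n-1) * (1 + q + ... + q^(n-1))."""
    geometric_sum = sum(q ** k for k in range(n))  # finite geometric series, fine at q = 1
    return x ** (n - 1) * geometric_sum


x, n = 2.0, 5
for q in (1.1, 1.01, 1.001, 1.0):
    # As q approaches 1 the value approaches n * x^(n-1).
    print(q, difference_quotient_via_q(x, q, n))
```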




However, some of those who do not like formulas even object to the formula for
the sum of a geometric series, so there is a way to differentiate $x^n$ without using
any formula. Just observe that

$$(x+h)^n = (x+h)(x+h)\cdots(x+h)$$

and observe that $x^n$ appears once. The next term, $hx^{n-1}$, appears once for each
factor, so n times. Every other term in the expansion has at least two factors of h.
This way the student can understand why

$$(x+h)^n = x^n + nhx^{n-1} + \text{terms that involve } h^2 \text{ or higher powers of } h.$$

Contrast this with the treatment in [6]. The formula for the expansion of $(x+h)^n$
is stated when n = 2, 3, 4, 5. Then the authors write “we can say that $(x+h)^n =
x^n + nx^{n-1}h$ + terms involving $h^2$ and higher powers of h”. There is a big
difference between “we can say that” and “we see why”. Mathematics should be 
an open subject, where students do not take such simple facts because “we can say 
that” or because the computer algebra system gives such a formula. 

Another of my concerns can be illustrated by a problem in the same book, but 
other books could have been used equally well. This deals with when something 
has been shown to be true. 

Consider problem 48 on page 365 of [6]. This has three parts. The first is to use 
Riemann sums to evaluate the integral from 1 to 2 of ln x. The second is to
evaluate this integral using anti-derivatives. The same integral had been done in 
the text, but from 2 to 3. Both of these parts are fine, except it would have been 
better not to use “evaluate” in the first part, but “approximate”, and it would have 
been better for the students to have been asked to do an integral that had not been 
done in the text in the second part. However, it is the third part that bothers me. 
The students are asked to “Explain in words why your answers verify the Fundamental
Theorem of Calculus”. This has not “verified” the Fundamental Theorem
of Calculus, but has illustrated that the approximation in the first part gives an 
approximation to the exact value obtained in the second. The first definition of 
“verify” in the dictionary at my desk is: “To prove to be true”. Words mean things 
and they are important. Meanings should not be changed without very good 
reasons. 
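
The distinction between approximating and evaluating can be seen in a few lines. This is only a sketch: the midpoint rule and the choice of 1000 subintervals are assumptions made here, not details taken from the problem in [6]. The Riemann sum comes close to the exact value $2\ln 2 - 1$ obtained from the antiderivative $x\ln x - x$, which illustrates the Fundamental Theorem without proving it.

```python
import math


def midpoint_riemann_sum(f, a, b, n):
    """Riemann sum of f on [a, b] with n equal subintervals, sampled at midpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


approx = midpoint_riemann_sum(math.log, 1.0, 2.0, 1000)
exact = 2.0 * math.log(2.0) - 1.0  # from the antiderivative x*ln(x) - x
print("Riemann sum:", approx)
print("exact value:", exact)
print("difference: ", approx - exact)
```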

The fourth speaker tried to show us how a computer algebra system could be 
used in a lecture setting. One of his main examples was Simpson’s rule. He set up 
the problem, got three linear equations in three variables, and said that these were 
far too complicated to solve at the board. He then displayed the solution via a 
computer algebra system. I asked him why he felt it was necessary to do the 
interpolation at the points a, a + h, and a + 2h, when you could do the interpola- 
tion at —h, 0, and h, or even —1, 0, and 1. Then the equations fall apart and it is 
very easy to do the algebra by hand. There is an important mathematical lesson 
taught when doing this: you can adapt the coordinate system to the problem at 
hand. That lesson needs to be learned whether you do calculations by hand or by 
computer algebra. 
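
Here is a sketch of how the equations fall apart with the nodes −1, 0, and 1. The function names are hypothetical, and exact rational arithmetic is used only so that the weights print as fractions. Requiring the rule to be exact for 1, x, and $x^2$ on [−1, 1] gives three equations that can be solved one at a time.

```python
from fractions import Fraction


def simpson_weights_on_symmetric_nodes():
    """Weights (w_minus, w_zero, w_plus) for the nodes -1, 0, 1 on [-1, 1]."""
    # Exactness for f(x) = x:    -w_minus + w_plus = 0, so the outer weights agree.
    # Exactness for f(x) = x^2:   w_minus + w_plus = 2/3.
    w_outer = Fraction(2, 3) / 2
    # Exactness for f(x) = 1:     w_minus + w_zero + w_plus = 2.
    w_zero = 2 - 2 * w_outer
    return w_outer, w_zero, w_outer


def simpson(f, a, b):
    """Carry the rule from [-1, 1] to [a, b]; weights scale by the half-width h."""
    h = Fraction(b - a) / 2
    mid = Fraction(a + b) / 2
    w_m, w_0, w_p = simpson_weights_on_symmetric_nodes()
    return h * (w_m * f(mid - h) + w_0 * f(mid) + w_p * f(mid + h))


print(simpson_weights_on_symmetric_nodes())
print(simpson(lambda x: x ** 3, 0, 2))  # the integral of x^3 on [0, 2]
```

The symmetric nodes also explain why the rule happens to be exact for cubics: the odd monomial $x^3$ integrates to zero on [−1, 1] and the equal outer weights cancel it as well.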

In talks on mathematics education, I frequently start with four guidelines that 
should be considered when teaching, writing a book, or developing a curriculum: 


• Do not lie to your students but don’t tell them the full truth.
• Some results in mathematics are more important than others and this should
be reflected in texts and in class.
• Mathematics is not a secret guild where something is true because I say it is or
because a computer algebra system says it is. When something simple and
important is studied, reasons should be given.
• Words are important and their meanings should not be changed without very
good reasons.


Examples where these have not been followed have already been given. There 
are many more in newer books. Since first mentioning these, I have decided to add 
one, which is very important for textbook writers and curriculum developers to 
observe. 


• Be careful that what you are doing does not lead others to make changes that
will hurt the long-term education of students. 


In his book [5] on textbooks written for the TIMSS study, Geoffrey Howson 
makes the following point: “The passing of the 1960s emphasis on algebraic 
structure need not be regretted. What is sad is that it has not been replaced by 
some other clear philosophical or pedagogical structure more appropriate to 
school mathematics.” He ends this paragraph with “A first attempt to establish 
such a framework of ‘recurring themes’ has been made by Gardiner [4]. It is an 
idea which deserves further consideration, development, and elaboration.” 

In the absence of such guidelines, textbook writers, curriculum developers and
test writers will look at the current curriculum and try to provide material that will get students ready
for later courses. Thus, one frequently overlooked point is how changes being 
made for one reason will impact in other ways. The newer calculus books tend to 
be more qualitative, and this is starting to show up on the AP Calculus exams. 

For various reasons, which will not be listed here, the knowledge of arithmetic 
and algebra that students starting calculus have has fallen. As a response to this 
poorer knowledge of algebra, the Harvard Consortium has tried to finesse the 
problem by emphasizing the use of graphing calculators. Other reasons are given 
for this, but a quotation from Tony Phillips suggests that this was a major factor. 
After saying that students’ manipulative skills have become much weaker, Phillips 
continued with: “And the HCC curriculum makes a great virtue out of this 
necessity. By eliminating some of the symbolic manipulation from calculus, they 
were able to make the course more accessible to students.” This was written in a 
newsletter from the Harvard Calculus Consortium. 

The report from a committee looking at the future of the AP Calculus exam 
reads like a description of the Harvard Calculus book. This is a very poor idea. Let 
me explain why with an analogy. When my son was in high school, the precalculus 
book they used was [3]. This was a very nice book, and he learned a lot from it. We
could not use it for the corresponding course at the University of Wisconsin. Our 
course went about twice as fast, and the students who took it were in general not 
as good at mathematics as those who took this in high school. Many of the students 
who will use mathematics seriously are now taking calculus in high school. They 
need to develop technical skills beyond those of students who take calculus for the 
qualitative ideas there. The past AP Calculus exams were reasonable exams. My 
finals in first and second semester calculus were in general a bit harder than the 
AB and BC exams. A few years ago, there were a couple of students in calculus 
lectures who wanted to transfer to California Institute of Technology. Exams were 
given to these students, in chemistry, mathematics, and physics. The mathematics 
exams over a three year period were sent to me, to share with the students or for 
them to take. They were a bit harder than the final exam in my second semester
calculus course, as is appropriate. They should be at the level of exams for an 
honor section. The MAA has published translations of university entrance exams 
for some Japanese universities [11]. The one for students who want to study 
humanities at Tokyo University is harder than the sophomore placement exam for 
Cal. Tech. Thus, the level of the traditional AB and BC advanced placement 
exams was not too high. It is likely that the newer one will continue at an equally 
high level for a couple of years, but from then on it needs to be watched carefully. 
There are experienced high school teachers who feel the earlier exams were harder 
than those given in the last few years, and with a more qualitative exam it will be 
very easy to have the level slip as students who take the exam have less technical 
skill. 

The message from the Calculus Reform programs that is being heard is that 
students do not need to be able to do algebra well. The message from the NCTM 
Reform is that students do not need to know how to do arithmetical calculations 
well. Both of these messages are different from those sent by the countries that did 
best on TIMSS. See [2] and [9]. 

Technology has a place in mathematics instruction, but it needs to be used 
carefully. Until much more is learned about the drawbacks, it should not be 
pushed too much. In England, the results on the age 11 exams in the summer of 
1995 were so poor that six months later a decision was made that calculators were 
no longer to be used in one of the two maths exams. A teacher in Milwaukee 
talked at the Wisconsin Mathematics Council meeting in May 1995. He said that 
he was probably the first one to use calculators in high school in the Milwaukee 
area, and also the first to use graphing calculators. However, he now has some 
serious doubts about the wisdom of using them as much as he had. First, students 
do not know enough yet to profit from their use to the same extent that teachers 
could, with their greater knowledge. Second, he looked at books from 30 years 
before and found that many topics now in second year algebra were in first year 
algebra then, topics from second year algebra then are frequently done in precal- 
culus now, and quite a few things that were once done in precalculus are not done 
in high school now. He did not mention specific things, but conic sections come
to mind as something that is frequently not done now, either in precalculus or in 
calculus. The binomial theorem is another. Both belong somewhere, and high 
school seems the right place. Unfortunately, NCTM put conic sections down for 
decreased emphasis. Other countries do them and many other things that we do 
not do. 

We need to look seriously at what is being done in the rest of the world, to see 
what our students could learn if they had a good mathematics program. Then we 
need to develop one. The most important part of this is not calculus, but 
elementary school. However, what we do in calculus has an impact on the rest of 
our mathematics program, so we need to be very careful about what we do and 
how we talk about what we are doing. The medical advice of “First, do no harm” is 
good advice for us as well. 
Personal items from the MONTHLY fifty years ago...


Dr. Paul Erdős has been appointed to a research professorship at Syracuse
University. 

Associate Professors Garrett Birkhoff and Saunders MacLane have been 
promoted to professorships. 

Professor Saunders MacLane of Harvard University has been appointed to 
a professorship at the University of Chicago. 

The following have received Guggenheim fellowship appointments: Pro- 
fessor Warren Ambrose of the Massachusetts Institute of Technology; 
Professor Garrett Birkhoff of Harvard University; Professor P. R. Halmos 
of the University of Chicago; Professor Saunders MacLane of the Univer- 
sity of Chicago; Professor A. H. Taub of the University of Washington. 


Associate Professor N. E. Steenrod of the University of Michigan has been 
appointed to an associate professorship at Princeton University. 


Dr. Irving Kaplansky at the University of Chicago has been promoted to 
an assistant professorship. 


Professor Antoni Zygmund of the University of Pennsylvania has been 
appointed to an associate professorship at the University of Oregon. 


Associate Professor Ivan Niven of Purdue University has been appointed 
to an associate professorship at the University of Oregon. 


Associate Professor Magnus R. Hestenes of the University of Chicago has 
been appointed to a professorship at the University of California at Los 
Angeles. 

... Vol. 54, 1947 
The Fifty-Seventh William Lowell Putnam
Mathematical Competition 


Leonard F. Klosinski, Gerald L. Alexanderson and Loren C. Larson 


The results of the Fifty-Seventh William Lowell Putnam Mathematical Competi- 
tion, held December 7, 1996, follow. They have been determined in accordance 
with the regulations governing the Competition, a contest supported by the 
William Lowell Putnam Prize Fund for the Promotion of Scholarship, a fund 
endowed by Mrs. Putnam in memory of her husband. The annual Competition is 
held under the auspices of the Mathematical Association of America. 

The first prize, $7,500, was awarded to the Department of Mathematics at Duke 
University. The members of the winning team were Andrew O. Dittmer, Robert R. 
Schneck, and Noam M. Shazeer; each was awarded a prize of $500. 

The second prize, $5,000, was awarded to the Department of Mathematics at 
Princeton University. The members of the winning team were Michael J. Gold- 
berg, Craig R. Helfgott, and Jacob A. Rasmussen; each was awarded a prize of 
$400. 

The third prize, $3,000, was awarded to the Department of Mathematics at
Harvard University. The members of the winning team were Chung-chieh Shan, 
Stephen S. Wang, and Hong Zhou; each was awarded a prize of $300. 

The fourth prize, $2,000, was awarded to the Department of Mathematics at 
Washington University, St. Louis. The members of the winning team were Mathew 
B. Crawford, Daniel K. Schepler, and Jade P. Vinson; each was awarded a prize of 
$200. 

The fifth prize, $1,000, was awarded to the Department of Mathematics at the 
California Institute of Technology. The members of the winning team were 
Christopher C. Chang, Hui Jin, and Hanhui Yuan; each was awarded a prize of 
$100. 

The six highest ranking individual contestants, in alphabetical order, were 
Jeremy L. Bem, Cornell University; Ioana Dumitriu, New York University; Robert 
D. Kleinberg, Cornell University; Dragos N. Oprea, Harvard University; Daniel K. 
Schepler, Washington University, St. Louis; and Stephen S. Wang, Harvard Uni- 
versity. Each of these has been designated a Putnam Fellow by the Mathematical 
Association of America and awarded a prize of $1,000 by the Putnam Prize Fund. 

The next four highest ranking contestants, in alphabetical order, were Federico 
Ardila, Massachusetts Institute of Technology; Michael R. Korn, Princeton Uni- 
versity; Ovidiu Savin, University of Pittsburgh; and Noam M. Shazeer, Duke 
University; each was awarded a prize of $500. 

The next five highest ranking contestants, in alphabetical order, were Andrew 
O. Dittmer, Duke University; Michael J. Goldberg, Princeton University; Samuel 
Grushevsky, Harvard University; Craig R. Helfgott, Princeton University; and 
Adam W. Meyerson, Massachusetts Institute of Technology; each was awarded a 
prize of $250. 

The next ten highest ranking contestants, in alphabetical order, were Pramod N. 
Achar, Massachusetts Institute of Technology; Constantin S. Chiscanu, Massachusetts
Institute of Technology; Mike L. Develin, Harvard University; Andrei
C. Gnepp, Harvard University; Hui Jin, California Institute of Technology; Carl D. 
Johnson, Georgia Institute of Technology; Amit Khetan, Massachusetts Institute 
of Technology; Alexandru A. Popa, Princeton University; Robert R. Schneck, 
Duke University; and Chung-chieh Shan, Harvard University; each was awarded a 
prize of $100. 

The following teams, named in alphabetical order, received honorable mention: 
the University of Chicago, with team members Nathan D. Broaddus, Benjamin
M. Cowan, and Christopher D. Jeris; the Massachusetts Institute of Technology,
with team members Federico Ardila, Eric Kuo, and Adam W. Meyerson; New
York University, with team members Aleksandr Bukharovich, Ioana Dumitriu, and
Yevgeniy V. Kovchegov; Queen’s University, Ontario, with team members Joanna 
L. Karczmarek, Michael A. Levi, and Allan J. Roberts; and the University of 
Waterloo, with team members Jason P. Bell, Kevin Purbhoo, and Soroosh 
Yazdani. 

Honorable mention was achieved by the following twenty-eight individuals 
named in alphabetical order: Dan E. Angelescu, California Institute of Technol- 
ogy; Jason P. Bell, University of Waterloo; Nathan D. Broaddus, University of 
Chicago; Christopher C. Chang, California Institute of Technology; Donny C. 
Cheung, University of Waterloo; Patrick K. Corn, Harvard University; Jacob 
Eliosoff, McGill University; Galen B. Huntington, Reed College; Christopher D. 
Jeris, University of Chicago; Daniel B. Johnston, Washington University, St. Louis; 
Joanna L. Karczmarek, Queen’s University, Ontario; Travis J. Kopp, Stanford 
University; Francois Labelle, McGill University; Ondrej Lhotak, University of 
Waterloo; Tamas Nemeth, Macalester College; Jacob A. Rasmussen, Princeton 
University; Robert Ribciuc, Harvard University; Alex Saltman, Harvard University; 
Yuliy V. Sannikov, Princeton University; Naoki Sato, University of Toronto; Mark 
J. Tilford, California Institute of Technology; Jiri J. L. Vanicek, Harvard Univer- 
sity; Jade P. Vinson, Washington University, St. Louis; Michael J. Westover, 
California Institute of Technology; Eric G. Yeh, Harvard University; Jun Zhang, 
University of Utah; Hong Zhou, Harvard University; and Aleksey Zinger, Mas- 
sachusetts Institute of Technology. 

The other individuals who achieved ranks among the top 98, in alphabetical 
order of their schools, were: Biola University, Jeffrey J. Hatch; the University of 
British Columbia, Lawrence P. Tang; the California Institute of Technology, 
Hanhui Yuan; Case Western Reserve University, Neil A. Rubin; Columbia Univer- 
sity, Joseph A. Hundley; Cornell University, Harold O. Fox; Duke University, 
Johanna L. Miller; the University of Florida, Keith A. Grizzell; Hanover College, 
Navdeep Jaitly; Harvard University, Davin Chor, Samit Dasgupta, Dmitry L. 
Sagalovskiy, Scott R. Sheffield, Florin Spinu, Jonathan L. Weinstein; the Univer- 
sity of Illinois, Urbana-Champaign, Brad A. Friedman; Lebanon Valley College, 
Jason C. Lee; the University of Louisville, Marc J. Broering; Macalester College, 
David B. Castro; Marquette University, Scott P. Kempen; the Massachusetts 
Institute of Technology, Kelvin L. Cheung, Brian C. Dean, Miroslav Jurisic, 
Edward D. Lee, Michael R. Tehranchi, Benjamin D. Wieland; the University of 
New Brunswick, Fai K. Tam; New York University, Aleksandr Bukharovich; Ohio 
University, David Huebel; Princeton University, Andrew M. Neitzke; Queen’s 
College of the City University of New York, Daniil Khaykis; Rice University, Noah 
A. Rosenberg, Brian M. Wahlert; the University of Richmond, Ronald A. Walker; 
Rose-Hulman Institute of Technology, Jamie L. Kawabata; Stanford University, 
Robert G. Au; the University of Texas, Austin, An T. Nguyen; the University of 
Texas, El Paso, Ricardo Alberto Sáenz; Washington University, St. Louis, Mathew
B. Crawford, Lawrence P. Roberts; and the University of Waterloo, Richard 
Hoshino, Derek I. E. Kisman, Alex Y. Lee, Kevin Purbhoo, Ian W. T. Vander- 
Burgh. 

The Elizabeth Lowell Putnam Prize, named for the wife of William Lowell 
Putnam and “awarded to a woman whose performance on the Competition has 
been deemed particularly meritorious,” is awarded this year to Ioana Dumitriu of 
New York University. The winner is awarded a prize of $500. 

There were 2,407 individual contestants from 408 colleges and universities in 
Canada and the United States in the competition of December 7, 1996. Teams 
were entered by 294 institutions. The Questions Committee for the fifty-seventh 
competition consisted of Mark I. Krusemeyer, Carleton College, chair; Richard K. 
Guy, University of Calgary; and Michael J. Larsen, University of Pennsylvania; 
they composed the problems and were most prominent among those suggesting 
solutions. 


PROBLEMS 


Problem A-1.

Find the least number A such that for any two squares of combined area 1, a 
rectangle of area A exists such that the two squares can be packed into that 
rectangle (without the interiors of the squares overlapping). You may assume that 
the sides of the squares will be parallel to the sides of the rectangle. 


Problem A-2. 

Let $C_1$ and $C_2$ be circles whose centers are 10 units apart and whose radii are 1
and 3. Find, with proof, the locus of all points M for which there exist points X on
$C_1$ and Y on $C_2$ such that M is the midpoint of the line segment XY.


Problem A-3. 

Suppose that each of twenty students has made a choice of anywhere from zero 
to six courses from a total of six courses offered. Prove or disprove: There are five 
students and two courses such that all five have chosen both courses or all five 
have chosen neither. 


Problem A-4. 
Let S be a set of ordered triples (a, b, c) of distinct elements of a finite set A.
Suppose that


1. $(a, b, c) \in S$ if and only if $(b, c, a) \in S$,

2. $(a, b, c) \in S$ if and only if $(c, b, a) \notin S$,

3. $(a, b, c)$ and $(c, d, a)$ are both in S if and only if $(b, c, d)$ and $(d, a, b)$ are
both in S.


Prove that there exists a one-to-one function $g: A \to \mathbb{R}$ such that $g(a) < g(b) <
g(c)$ implies $(a, b, c) \in S$.


Problem A-5.
If p is a prime number greater than 3, and $k = \lfloor 2p/3 \rfloor$, prove that the sum

$$\binom{p}{1} + \binom{p}{2} + \cdots + \binom{p}{k}$$

of binomial coefficients is divisible by $p^2$.

For example, $\binom{7}{1} + \binom{7}{2} + \binom{7}{3} + \binom{7}{4} = 7 + 21 + 35 + 35 = 2 \cdot 7^2$.


Problem A-6.
Let c > 0 be a constant. Give a complete description, with proof, of the set of
all continuous functions $f: \mathbb{R} \to \mathbb{R}$ such that $f(x) = f(x^2 + c)$ for all $x \in \mathbb{R}$.


Problem B-1.
Define a selfish set to be a set which has its own cardinality (number of
elements) as an element. Find, with proof, the number of subsets of {1, 2, ..., n}
which are minimal selfish sets, that is, selfish sets none of whose proper subsets
are selfish.


Problem B-2. 
Show that for every positive integer n,

$$\left(\frac{2n-1}{e}\right)^{\frac{2n-1}{2}} < 1 \cdot 3 \cdot 5 \cdots (2n-1) < \left(\frac{2n+1}{e}\right)^{\frac{2n+1}{2}}.$$


Problem B-3. 
Given that $\{x_1, x_2, \ldots, x_n\} = \{1, 2, \ldots, n\}$, find, with proof, the largest possible
value, as a function of n (with $n \ge 2$), of

$$x_1x_2 + x_2x_3 + \cdots + x_{n-1}x_n + x_nx_1.$$


Problem B-4. 
For any square matrix A, we can define sin A by the usual power series:

$$\sin A = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} A^{2n+1}.$$

Prove or disprove: There exists a 2 × 2 matrix A with real entries such that

$$\sin A = \begin{pmatrix} 1 & 1996 \\ 0 & 1 \end{pmatrix}.$$


Problem B-5.

Given a finite string S of symbols X and O, we write $\Delta(S)$ for the number of
X’s in S minus the number of O’s. For example, $\Delta(XOOXOOX) = -1$. We call a
string S balanced if every substring T of (consecutive symbols of) S has $-2 \le
\Delta(T) \le 2$. Thus, XOOXOOX is not balanced, since it contains the substring
OOXOO. Find, with proof, the number of balanced strings of length n.


Problem B-6. 

Let $(a_1, b_1), (a_2, b_2), \ldots, (a_n, b_n)$ be the vertices of a convex polygon which
contains the origin in its interior. Prove that there exist positive real numbers x
and y such that

$$(a_1, b_1)x^{a_1}y^{b_1} + (a_2, b_2)x^{a_2}y^{b_2} + \cdots + (a_n, b_n)x^{a_n}y^{b_n} = (0, 0).$$


SOLUTIONS. In the 12-tuples $(n_{10}, n_9, n_8, n_7, n_6, n_5, n_4, n_3, n_2, n_1, n_0, n_{-1})$
following each problem number below, $n_i$ for $10 \ge i \ge 0$ is the number of students
among the top 206 contestants achieving i points for the problem and $n_{-1}$ is the
number of those not submitting solutions.


A-1 (87, 26, 47, 0, 0, 0, 0, 0, 6, 9, 31, 0) 

Answer. We can always accommodate the two squares in a rectangle of area
$A = (1 + \sqrt{2})/2$.

Solution 1. Let $x = \cos\theta$, $y = \sin\theta$, $0 \le \theta \le \pi/2$. Then

$$x(x+y) = \cos\theta(\cos\theta + \sin\theta) = \sqrt{2}\cos\theta\left(\frac{1}{\sqrt{2}}\cos\theta + \frac{1}{\sqrt{2}}\sin\theta\right)$$

$$= \sqrt{2}\cos\theta\sin(\pi/4 + \theta) = \frac{1}{\sqrt{2}}\left(\sin(2\theta + \pi/4) + \sin(\pi/4)\right),$$

which is maximized for $2\theta + \pi/4 = \pi/2$. For this value of $\theta$, x > y, so the
maximum value we desire is $(1 + \sin(\pi/4))/\sqrt{2} = (1 + \sqrt{2})/2$.


Solution 2. Write X = x and Y = ky for some positive constant k yet to be
determined. Then

$$x(x+y) = X^2 + \frac{XY}{k} \le X^2 + \frac{X^2 + Y^2}{2k} = x^2 + \frac{x^2 + k^2y^2}{2k} = \frac{(2k+1)x^2 + k^2y^2}{2k}.$$

Now choose positive k so that $2k + 1 = k^2$, namely, $k = 1 + \sqrt{2}$. Then,

$$x(x+y) \le \frac{k^2(x^2 + y^2)}{2k} = \frac{k}{2}.$$

For x = ky, X = Y, the inequality is actually an equality, and x > y. So the
maximum value is $k/2 = (1 + \sqrt{2})/2$.


Solution 3. Maximizing x(x + y) subject to $x^2 + y^2 = 1$ by Lagrange multipliers
yields

$$2x + y = \lambda \cdot 2x, \qquad x = \lambda \cdot 2y, \qquad x^2 + y^2 = 1.$$

Multiplying the first equation by y and the second by x eventually yields

$$x + y = x\sqrt{2} \quad (\text{since } x, y \ge 0).$$

So $y = x(\sqrt{2} - 1) < x$, and from $x^2 + y^2 = 1$, we get

$$x^2 = \frac{1}{4 - 2\sqrt{2}},$$

$$x^2(x+y)^2 = \frac{1}{4 - 2\sqrt{2}} \cdot 2x^2 = \frac{2}{(4 - 2\sqrt{2})^2},$$

$$x(x+y) = \frac{\sqrt{2}}{4 - 2\sqrt{2}} = \frac{1 + \sqrt{2}}{2}.$$


Solution 4. If A is the maximum value of x(x + y) subject to $x^2 + y^2 = 1$, then
the hyperbola x(x + y) = A is tangent to the circle $x^2 + y^2 = 1$. The asymptotes
of this hyperbola are the lines x = 0 and x + y = 0, so the line $y = (\tan(\pi/8))x$ is
an axis of symmetry of both the circle and the hyperbola and passes through the
point of tangency. It follows that

$$y = \sin(\pi/8), \qquad x = \cos(\pi/8),$$

and thus,

$$x(x+y) = \cos^2(\pi/8) + \sin(\pi/8)\cos(\pi/8) = \frac{1}{2}\left(1 + \cos(\pi/4)\right) + \frac{1}{2}\sin(\pi/4) = \frac{1 + \sqrt{2}}{2}.$$

A-2 (6, 11, 27, 0, 0, 0, 0, 0, 65, 23, 46, 28) 


Solution. Let $O_1, O_2$ be the centers of $C_1, C_2$ respectively, and let O be the
midpoint of $O_1O_2$. Then the desired set is the closed annulus (ring) with center O,
inner radius 1, and outer radius 2.

To see this, note that if M is the midpoint of a line segment XY as described,
then

$$\overrightarrow{OM} = \tfrac{1}{2}\left(\overrightarrow{OX} + \overrightarrow{OY}\right) = \tfrac{1}{2}\left(\overrightarrow{OO_1} + \overrightarrow{O_1X} + \overrightarrow{OO_2} + \overrightarrow{O_2Y}\right) = \tfrac{1}{2}\left(\overrightarrow{O_1X} + \overrightarrow{O_2Y}\right)$$

since $\overrightarrow{OO_1} + \overrightarrow{OO_2} = \vec{0}$. Now $\overrightarrow{O_1X}$ can range over all vectors of length 1 and $\overrightarrow{O_2Y}$
can range over all vectors of length 3. Therefore $3 - 1 \le |\overrightarrow{O_1X} + \overrightarrow{O_2Y}| \le 3 + 1$
(by the triangle inequality), and it is easy to see that any vector of length between 2
and 4 can indeed be obtained as the sum of a vector of length 1 and a vector of
length 3. (Use the law of cosines if you like.) So $\overrightarrow{OM} = \tfrac{1}{2}(\overrightarrow{O_1X} + \overrightarrow{O_2Y})$ can be any
vector of length between 1 and 2, and the result follows.


A-3 (63, 18, 6, 0, 0, 0, 0, 0, 0, 0, 47, 72) 


Solution. The answer is “no”; there need be no such set of five students. 

Suppose that each of the 20 students chooses exactly 3 courses (and omits to 
take exactly 3 courses), but each does it in a different way. This is possible, since 
20 = $\binom{6}{3}$. Then, for each pair of courses, there will be exactly four students
enrolled for both courses (their third choices will all be different—one of the 
4 = 6 — 2 courses other than the pair under consideration). No pair of courses will 
have five common enrollees. Correspondingly, for each pair of courses, there will 
be exactly four students enrolled for neither of the two, and no pair of courses 
will have five common absentees. 
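
The counterexample is small enough to check by brute force. The sketch below labels the six courses 0 through 5 (a labeling introduced here only for illustration), gives each of the 20 students a different set of three courses, and counts, for every pair of courses, how many students chose both and how many chose neither.

```python
from itertools import combinations

courses = range(6)
# One student for each of the C(6, 3) = 20 ways to pick three courses.
students = list(combinations(courses, 3))

both = [
    sum(1 for s in students if a in s and b in s)
    for a, b in combinations(courses, 2)
]
neither = [
    sum(1 for s in students if a not in s and b not in s)
    for a, b in combinations(courses, 2)
]

print(len(students), "students")
print("students taking both courses, per pair:   ", sorted(set(both)))
print("students taking neither course, per pair: ", sorted(set(neither)))
```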


A-4 (3,7, 7, 0,0, 0, 0, 0, 7, 19, 67, 96) 


Solution. Intuitively, one regards A as a subset of a circle and S as the set of
triples in counterclockwise order. To obtain a linear order, we have to choose a
starting point. Fixing $a_0 \in A$, we define a relation < on A by


(i) For all $b \ne a_0$, $a_0 < b$.
(ii) If $a_0$, b, and c are all distinct, then b < c if and only if $(a_0, b, c) \in S$.


By (1) and (2), for all $b \ne c$, either b < c or c < b, but not both. By (1) and (3),
b < c and c < d implies b < d. Thus < gives A the structure of an ordered set.
Defining $g(a) = |\{b \in A \mid b < a\}|$, we see that g(a) < g(b) < g(c) implies a < b
and b < c; if $a = a_0$, $(a, b, c) \in S$ by definition. Otherwise, $(a_0, a, b) \in S$,
$(a_0, b, c) \in S$, and the result follows from (1) and (3).


A-5 (4, 4, 2, 0, 0, 0, 0, 0, 0, 13, 40, 143)


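Solution. Each binomial coefficient is divisible by p (since p divides the numerator
and not the denominator of p!/(r!(p − r)!)), so we wish to show that

$$1 + \frac{p-1}{2} + \frac{(p-1)(p-2)}{2 \cdot 3} + \cdots + \frac{(p-1)(p-2)\cdots(p-k+1)}{2 \cdot 3 \cdots k}$$

is divisible by p. The terms are all integers and we express the sum as a sum of
fractions whose numerators are multiples of p and whose denominators are prime
to p. The sum is equal to

$$\frac{pc_2}{2!} + \frac{pc_3}{3!} + \cdots + \frac{pc_k}{k!} + \left(1 - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{(-1)^{k-1}}{k}\right),$$

where the $c_r$ are integers and the final parenthesis is, when p = 6q + 1, k = 4q,
equal to

$$1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{4q} - 2\left(\frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{4q}\right)$$

$$= \frac{1}{2q+1} + \frac{1}{2q+2} + \cdots + \frac{1}{4q}$$

$$= \left(\frac{1}{2q+1} + \frac{1}{4q}\right) + \left(\frac{1}{2q+2} + \frac{1}{4q-1}\right) + \cdots + \left(\frac{1}{2q+q} + \frac{1}{4q-(q-1)}\right)$$

$$= \frac{p}{(2q+1)4q} + \frac{p}{(2q+2)(4q-1)} + \cdots + \frac{p}{3q(3q+1)}$$

and, when p = 6q + 5, k = 4q + 3, equal to

$$1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{4q+3} - 2\left(\frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{4q+2}\right)$$

$$= \frac{1}{2q+2} + \frac{1}{2q+3} + \cdots + \frac{1}{4q+3}$$

$$= \left(\frac{1}{2q+2} + \frac{1}{4q+3}\right) + \left(\frac{1}{2q+3} + \frac{1}{4q+2}\right) + \cdots + \left(\frac{1}{2q+q+2} + \frac{1}{4q+3-q}\right)$$

$$= \frac{p}{(2q+2)(4q+3)} + \frac{p}{(2q+3)(4q+2)} + \cdots + \frac{p}{(3q+2)(3q+3)}.$$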

A-6 (4, 4, 11, 0, 0, 0, 0, 0, 18, 13, 46, 110) 


Solution. We begin with the general observation that $f(x) = f(x^2 + c) = f(-x)$,
so $f$ is always even. Conversely, the even extension of any continuous function on
$[0, \infty)$ satisfies the functional equation as long as the original function does.
Therefore, we may and do restrict attention to $x \ge 0$ in everything that follows.
We consider two cases.


Case 1: $0 < c \le 1/4$.

Here, $x^2 - x + c = 0$ has positive zeros, $a = (1 - \sqrt{1-4c})/2$ and $b =
(1 + \sqrt{1-4c})/2$. If $0 \le x_0 < b$ and we define $x_{n+1} = x_n^2 + c$, the monotonicity of
$x^2 + c$ on $[0, \infty)$ implies that $x_0, x_1, \ldots$ is monotonic (increasing for $0 \le x_0 < a$,
decreasing for $a < x_0 < b$) and bounded, therefore convergent, and therefore
convergent to $a$ since the limit must satisfy $L = L^2 + c$. As

\[ f(x_0) = f(x_1) = \cdots = \lim f(x_n) = f(\lim x_n) = f(a), \]

we have $f(x) = f(a)$ for all $x$, $0 \le x < b$.

If $x_0 > b$, the monotonicity of $\sqrt{x - c}$ guarantees that $x_0 > \sqrt{x_0 - c} > b$, so we
can define recursively $x_{n+1} = \sqrt{x_n - c}$. Again, the sequence $(x_n)$ is bounded and
monotonic; therefore it also has a limit, and this limit must be $b$. Then

\[ f(x_0) = f(x_1) = \cdots = \lim f(x_n) = f(\lim x_n) = f(b). \]

As the range of $f$ is finite and $f$ is continuous, it is constant.


Case 2: $c > 1/4$.

Now, $x \mapsto x^2 + c$ has no real fixed points. Setting $t_0 = 0$, $t_{n+1} = t_n^2 + c$, the
sequence $(t_n)$ is monotonic, so if it didn't go to infinity, it would have to converge to
a (non-existent) fixed point. So each $x \ge 0$ is in some interval $[t_n, t_{n+1}]$.

Let $g$ be any continuous function on the interval $[0, c]$ such that $g(c) = g(0)$.
Define $\varphi(x) = \sqrt{x - c}$ and

\[ f(x) = \begin{cases}
g(x) & \text{for } x \in [0, c] = [t_0, t_1], \\
g(\varphi(x)) & \text{for } x \in [c, c^2 + c] = [t_1, t_2], \\
g(\varphi(\varphi(x))) & \text{for } x \in [t_2, t_3],
\end{cases} \]

and in general

\[ f(x) = g(\varphi(\varphi(\cdots(\varphi(x))\cdots))) \quad \text{for } x \in [t_n, t_{n+1}]. \]

By construction, $f(x)$ satisfies the desired functional equation. Continuity is
obvious except at the points $t_n$, where it follows from $g(c) = g(0)$. Conversely,
every function $f(x)$ is determined by its values on $[0, c)$.
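The Case 2 construction translates directly into a short routine. This is a minimal sketch, assuming $c > 1/4$ and a user-supplied $g$ on $[0, c]$ with $g(0) = g(c)$; the name `make_solution` is hypothetical.

```python
import math


def make_solution(c, g):
    """Extend g from [0, c] to a function f on the real line (case c > 1/4).

    f is even, and for x > c the value is obtained by applying
    phi(x) = sqrt(x - c) until the argument falls back into [0, c].
    """
    def f(x):
        x = abs(x)                 # f is even
        while x > c:               # each step moves [t_n, t_{n+1}] to [t_{n-1}, t_n]
            x = math.sqrt(x - c)
        return g(x)
    return f
```

With `f = make_solution(1.0, lambda x: math.sin(2 * math.pi * x))`, the values `f(x)` and `f(x * x + 1.0)` agree up to rounding error, which is the functional equation for $c = 1$.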


B-1 (113, 72, 17, 0, 0, 0, 0, 0, 3, 1, 0, 0) 


Solution. The cardinality is the least member of a minimal selfish set, else there
would be a proper selfish subset. But there is no other restriction. So their number
is

\[ \binom{n-1}{0} + \binom{n-2}{1} + \binom{n-3}{2} + \cdots, \]

which, by induction, is the $n$th Fibonacci number, $F_n$, where $F_1 = F_2 = 1$, $F_{n+1} =
F_n + F_{n-1}$.
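The Fibonacci count can be confirmed by brute force for small $n$. The sketch below assumes the usual definitions from the competition problem, which are not restated in this solution: a set is selfish when it contains its own cardinality, and minimal selfish when no proper subset is selfish. The function names are illustrative.

```python
from itertools import combinations


def is_selfish(s):
    """A set is selfish when its cardinality is one of its members."""
    return len(s) in s


def count_minimal_selfish(n):
    """Count subsets of {1, ..., n} that are selfish with no selfish proper subset."""
    count = 0
    for size in range(1, n + 1):
        for combo in combinations(range(1, n + 1), size):
            if not is_selfish(set(combo)):
                continue
            has_selfish_subset = any(
                is_selfish(set(sub))
                for r in range(1, size)
                for sub in combinations(combo, r))
            if not has_selfish_subset:
                count += 1
    return count


def fib(n):
    """F_1 = F_2 = 1, F_{n+1} = F_n + F_{n-1}."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a
```

For $n = 3$ the minimal selfish sets are $\{1\}$ and $\{2, 3\}$, matching $F_3 = 2$.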
B-2 (85, 13, 6, 0, 0, 0, 0, 0, 15, 7, 28, 52) 
Solution. Let $M$ be the (natural) logarithm of the product “in the middle”:

\[ M = \ln(1 \cdot 3 \cdot 5 \cdots (2n-1)) = \ln 3 + \ln 5 + \cdots + \ln(2n-1). \]

If we take twice this quantity, we can interpret that as a Riemann sum for a
definite integral of $\ln x$, and therefore we get estimates (using (a) partition points
$3, 5, 7, \ldots, 2n+1$, together with left-hand endpoints, and (b) partition points
$1, 3, 5, \ldots, 2n-1$, together with right-hand endpoints):

\[ 2M < \int_3^{2n+1} \ln x \, dx \quad\text{and}\quad 2M > \int_1^{2n-1} \ln x \, dx. \]

So we have

\[ \int_1^{2n-1} \ln x \, dx < 2M < \int_3^{2n+1} \ln x \, dx, \]

\[ [x \ln x - x]_1^{2n-1} < 2M < [x \ln x - x]_3^{2n+1}, \]

\[ (2n-1)\ln(2n-1) - (2n-1) + 1 < 2M < (2n+1)\ln(2n+1) - (2n+1) - (3\ln 3 - 3), \]

so certainly

\[ (2n-1)\ln(2n-1) - (2n-1) < 2M < (2n+1)\ln(2n+1) - (2n+1), \]

\[ \frac{2n-1}{2} \ln\left(\frac{2n-1}{e}\right) < M < \frac{2n+1}{2} \ln\left(\frac{2n+1}{e}\right). \]

Taking exponentials now yields the desired result.
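The final pair of inequalities for $M$ can be checked numerically in logarithmic form, which avoids overflow for large $n$. A minimal sketch, with an illustrative function name:

```python
import math


def bounds_hold(n):
    """Check ((2n-1)/2) ln((2n-1)/e) < ln(1*3*...*(2n-1)) < ((2n+1)/2) ln((2n+1)/e)."""
    log_product = sum(math.log(2 * j - 1) for j in range(1, n + 1))
    lower = (2 * n - 1) / 2 * math.log((2 * n - 1) / math.e)
    upper = (2 * n + 1) / 2 * math.log((2 * n + 1) / math.e)
    return lower < log_product < upper
```

For $n = 1$ the product is $1$, and the bounds read $-0.5 < 0 < 1.5 \ln(3/e)$.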


B-3 (20, 19, 8, 0, 0, 0, 0, 0, 14, 36, 56, 53) 


Solution. Equivalently, we have to form an $n$-bead string out of beads labelled
from 1 to $n$ in order to maximize the sum of products of adjacent bead values.
Suppose that for some $k < n$, the beads 1 through $k$ form a single connected
chain but that bead $k + 1$ is not adjacent to any bead in that chain. Then there
exists a connected chain of beads that contains all beads 1 through $k$, but does not
contain bead $k + 1$, and is such that one end is adjacent to bead $k + 1$, while the
other is one of the beads from 1 to $k$. By cutting out this chain and reversing its
order, we obtain a new string with greater product sum. Thus, $k + 1$ must be
adjacent to the string with beads 1 through $k$. In fact, it must be adjacent to the
bead with the smaller value of the two end beads, since otherwise we could
increase the product sum by cutting out the string with beads 1 through $k$ and
reversing the order. By induction on $k$, the string reads $\ldots, 7, 5, 3, 1, 2, 4, 6, 8, \ldots$.
We conclude by a standard summation that the answer is $(2n^3 + 3n^2 - 11n +
18)/6$.
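The closed form can be compared with an exhaustive search for small $n$. The sketch below assumes that the beads close up into a loop, so that the last bead is also adjacent to the first (this is what makes $n = 3$ give $1\cdot 2 + 2\cdot 3 + 3\cdot 1 = 11$); the function names are illustrative.

```python
from itertools import permutations


def max_cyclic_sum(n):
    """Largest sum of products of adjacent labels over all circular arrangements."""
    best = 0
    for rest in permutations(range(2, n + 1)):
        perm = (1,) + rest   # fix bead 1 to avoid counting rotations
        total = sum(perm[i] * perm[(i + 1) % n] for i in range(n))
        best = max(best, total)
    return best


def closed_form(n):
    """(2n^3 + 3n^2 - 11n + 18) / 6, which is an integer for every n."""
    return (2 * n**3 + 3 * n**2 - 11 * n + 18) // 6
```

For $n = 5$ the arrangement $5, 3, 1, 2, 4$ gives $15 + 3 + 2 + 8 + 20 = 48$, which is `closed_form(5)`.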
B-4 (40, 4, 7, 0, 0, 0, 0, 0, 17, 4, 48, 86)


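Solution. We'll show that there is no such matrix $A$. First of all, note that for any
invertible matrix $P$ and any square matrix $B$ of the same size,

\[ \sin(PBP^{-1}) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} (PBP^{-1})^{2n+1} = P(\sin B)P^{-1}. \]

Thus the sines of similar matrices are similar.

Now any $2 \times 2$ matrix $A$ with real entries is similar to either a diagonal matrix
$\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$ with real or complex entries $\lambda_1$, $\lambda_2$, or a triangular matrix
$\begin{pmatrix} \lambda & c \\ 0 & \lambda \end{pmatrix}$ with real entries $\lambda$, $c$ (in fact, one can take $c = 1$). Therefore, $\sin A$ is similar to either
$\sin\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$ or $\sin\begin{pmatrix} \lambda & c \\ 0 & \lambda \end{pmatrix}$. But
$\sin\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$ is a diagonal matrix, since all powers of
$\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$ are diagonal, and no diagonal matrix is similar to
$\begin{pmatrix} 1 & 1996 \\ 0 & 1 \end{pmatrix}$, since the latter is not diagonalizable. So if
$\sin A = \begin{pmatrix} 1 & 1996 \\ 0 & 1 \end{pmatrix}$, there must be real numbers $\lambda$
and $c$ such that $\begin{pmatrix} 1 & 1996 \\ 0 & 1 \end{pmatrix}$ is similar to $\sin\begin{pmatrix} \lambda & c \\ 0 & \lambda \end{pmatrix}$.

Let $U = \begin{pmatrix} \lambda & c \\ 0 & \lambda \end{pmatrix}$. We compute $\sin U$ explicitly: we have
$U^2 = \begin{pmatrix} \lambda^2 & 2c\lambda \\ 0 & \lambda^2 \end{pmatrix}$,
$U^3 = \begin{pmatrix} \lambda^3 & 3c\lambda^2 \\ 0 & \lambda^3 \end{pmatrix}$, and by induction
$U^n = \begin{pmatrix} \lambda^n & nc\lambda^{n-1} \\ 0 & \lambda^n \end{pmatrix}$. Therefore,

\[ \sin U = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} \begin{pmatrix} \lambda^{2n+1} & (2n+1)c\lambda^{2n} \\ 0 & \lambda^{2n+1} \end{pmatrix}
= \begin{pmatrix} \displaystyle\sum_{n=0}^{\infty} \frac{(-1)^n \lambda^{2n+1}}{(2n+1)!} & \displaystyle c\sum_{n=0}^{\infty} \frac{(-1)^n \lambda^{2n}}{(2n)!} \\ 0 & \displaystyle\sum_{n=0}^{\infty} \frac{(-1)^n \lambda^{2n+1}}{(2n+1)!} \end{pmatrix}
= \begin{pmatrix} \sin\lambda & c\cos\lambda \\ 0 & \sin\lambda \end{pmatrix}. \]

For this matrix to be similar to $\begin{pmatrix} 1 & 1996 \\ 0 & 1 \end{pmatrix}$, the double eigenvalue $\sin\lambda$ must equal
1. But then $\cos\lambda = 0$ and so $\sin U = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, which is not similar to
$\begin{pmatrix} 1 & 1996 \\ 0 & 1 \end{pmatrix}$ after all, so we have a contradiction.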


B-5 (34, 4, 7, 0, 0, 0, 0, 0, 3, 2, 58, 98)


Solution. Balanced strings consist of X’s and O’s arranged alternately, or with as
many as two consecutive letters of the same kind. Such occurrences of double
letters must happen alternately, ...XX...OO...XX..., with an even number
(possibly zero) of single letters between each occurrence. If $b_n$ is the number of
balanced strings of length $n$, we will show that

\[ b_{n+2} = 2b_n + 2, \quad\text{i.e.,}\quad b_{n+2} + 2 = 2(b_n + 2). \]

(When we have counted X, O, $b_1 = 2$; XX, XO, OX, OO, $b_2 = 4$; XXO, XOX,
XOO, OXX, OXO, OOX, $b_3 = 6$; XXOX, XXOO, XOXX, XOXO, XOOX, OXXO,
OXOX, OXOO, OOXX, OOXO, $b_4 = 10$; then we have more than enough information
to deduce that $b_{2n} = 3 \cdot 2^n - 2$ and $b_{2n-1} = 2^{n+1} - 2$.)

Let $x_n$, $y_n$, $z_n$ denote, respectively, the number of balanced strings of length $n$
that end with XX, the number that end with XO and whose last occurrence of a
double letter was XX, and the number that end with XO and whose last
occurrence of a double letter was OO. Then $x_n$, $y_n$, $z_n$ also denote, respectively,
the number of balanced strings of length $n$ that end with OO, with OX and whose
last occurrence of a double letter was OO, and with OX and whose last occurrence
of a double letter was XX. (We count the purely alternating strings XOX...OX
and OXO...XO twice, once among the $y_n$ and once among the $z_n$.) So, $b_n = 2x_n
+ 2y_n + 2z_n - 2$, or $b_n + 2 = 2(x_n + y_n + z_n)$.

We can form balanced strings of length $n + 2$ from such strings of length $n$ in
exactly the following ways:


i. ...XX; legal to add OO or OX, but nothing else; ($2x_n$);
ii. ...XO with last double letter XX; legal to add OX or XO, but nothing
else; ($2y_n$);
iii. ...XO with last double letter OO; legal to add XX or XO, but nothing else;
($2z_n$).


Similarly, with O and X interchanged throughout. Thus, $b_{n+2} + 2 = 4(x_n +
y_n + z_n) = 2(b_n + 2)$.
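The recurrence and the closed forms can be compared with a direct enumeration. The sketch below assumes the definition of "balanced" from the competition problem, which is not restated in this solution: in every substring the numbers of X's and O's differ by at most 2. The function names are illustrative.

```python
from itertools import product


def is_balanced(s):
    """True when every substring has |#X - #O| <= 2."""
    for i in range(len(s)):
        diff = 0
        for ch in s[i:]:
            diff += 1 if ch == "X" else -1
            if abs(diff) > 2:
                return False
    return True


def count_balanced(n):
    """Number of balanced strings of length n, by exhaustive enumeration."""
    return sum(is_balanced(w) for w in product("XO", repeat=n))


def closed_form(n):
    """b_{2m} = 3 * 2^m - 2 and b_{2m-1} = 2^(m+1) - 2."""
    if n % 2 == 0:
        return 3 * 2 ** (n // 2) - 2
    return 2 ** ((n + 1) // 2 + 1) - 2
```

The enumeration reproduces the hand counts $b_1 = 2$, $b_2 = 4$, $b_3 = 6$, $b_4 = 10$ listed in the solution.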


B-6 (0, 1, 9, 0, 0, 0, 0, 0, 2, 0, 23, 171) 


Solution. Let $f(v) = \sum_i e^{(a_i, b_i) \cdot v}$. If $\nabla f(x_0, y_0) = 0$, then $(e^{x_0}, e^{y_0})$ is a solution of
the original vector equation. As $\nabla f$ is a continuous vector field, to prove that it has
a zero, it suffices to find a simple closed contour over which $\nabla f$ has a nonzero
winding number. We claim that any sufficiently large counterclockwise circle
around the origin will do. Indeed, for some $r > 0$,

\[ \max_i \{v \cdot (a_i, b_i)\} \ge r\|v\|, \]

so

\[ v \cdot \nabla f(v) = \sum_{i=1}^{n} e^{(a_i, b_i) \cdot v} \bigl((a_i, b_i) \cdot v\bigr)
\ge r\|v\| e^{r\|v\|} + (n-1) \inf_x x e^x = r\|v\| e^{r\|v\|} - (n-1) e^{-1}. \]

For $\|v\| = R \gg 0$, this means that $v \cdot \nabla f(v) > 0$, which means that the winding
number of $\nabla f(v)$ along a circular path of radius $R$ is 1.


Klosinski / Alexanderson: Department of Mathematics, Santa Clara University, Santa Clara, CA 95053-0290

Larson: Department of Mathematics, St. Olaf College, Northfield, MN 55057


NOTES


Edited by Jimmie D. Lawson 


A Quadratic Trio 


Joseph Kupka 


Strictly speaking, the quadratic formula is unnecessary. One may always complete 
the square. We use and treasure the quadratic formula because it is often less 
tedious than completing the square. 

The purpose of this note is to treat the Maclaurin series of the function

\[ f(x) = \frac{\beta x + \gamma}{Ax^2 + Bx + C} = \sum_{n=0}^{\infty} \lambda_n x^n, \qquad A, C \ne 0, \]

in the same spirit. We refer to $f$ as the generating function for the sequence of
coefficients $\lambda_n$. The standard technique for determining the $\lambda_n$ is to decompose $f$
into partial fractions and expand these into (possibly complex) geometric series.
We use it, once and for all, to produce three formulas free of complex notation.
Illustrations of this “quadratic trio” include calculations connected with the tossing
of a coin with probability $p$ for heads and $q = 1 - p$ for tails.


(i) If $B^2 - 4AC > 0$, then

\[ \lambda_n = \frac{(-1)^n}{D} \left[ \left( \gamma\,\frac{B+D}{2C} - \beta \right) \left( \frac{B+D}{2C} \right)^n - \left( \gamma\,\frac{B-D}{2C} - \beta \right) \left( \frac{B-D}{2C} \right)^n \right], \tag{1} \]

where $D = \sqrt{B^2 - 4AC}$. Note: the $(B \pm D)/2C$ are not misprints of $(B \pm
D)/2A$. The Fibonacci sequence ($f_0 = 0$, $f_1 = 1$, $f_n = f_{n-1} + f_{n-2}$, $n \ge 2$), which
has generating function $F(x) = \sum_{n=0}^{\infty} f_n x^n = x/(1 - x - x^2)$, provides a straightforward
illustration of (1). Substituting $A = B = -1$, $C = \beta = 1$, $\gamma = 0$, $D = \sqrt5$
into (1) gives

\[ f_n = \frac{(-1)^n}{\sqrt5} \left[ -\left( \frac{-1+\sqrt5}{2} \right)^n + \left( \frac{-1-\sqrt5}{2} \right)^n \right], \]

which simplifies at once to the more familiar form

\[ f_n = \frac{1}{\sqrt5} \left[ \left( \frac{\sqrt5+1}{2} \right)^n - (-1)^n \left( \frac{\sqrt5-1}{2} \right)^n \right]. \]

A sometimes simpler alternative to (1) is

\[ \lambda_n = \frac{1}{A(r_1 - r_2)} \left[ \left( \frac{\gamma}{r_2} + \beta \right) \left( \frac{1}{r_2} \right)^n - \left( \frac{\gamma}{r_1} + \beta \right) \left( \frac{1}{r_1} \right)^n \right], \tag{1*} \]

where $r_1$, $r_2$ denote the roots of $Ax^2 + Bx + C$. For example, when $p \ne q$, the
classical formula $(1 + (q-p)^n)/2$ for the probability of an even number of heads
in $n$ tosses of the coin is more easily obtained from the generating function
$(1 - qs)/((1 - s)(1 - (q-p)s))$ by using (1*) instead of (1).

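(ii) If $B^2 - 4AC = 0$, then

\[ \lambda_n = \frac{(-1)^n}{C} \left( \frac{B}{2C} \right)^{n-1} \left[ (n+1)\gamma \left( \frac{B}{2C} \right) - n\beta \right]. \tag{2} \]

This is obtained via the identity $(1 - x)^{-2} = \sum_{n=0}^{\infty} (n+1)x^n$.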
(iii) If $B^2 - 4AC < 0$, and if $A > 0$, then

\[ \lambda_n = \frac{2}{D} \left( \frac{A}{C} \right)^{n/2} \left[ \gamma \left( \frac{A}{C} \right)^{1/2} \sin(n+1)\theta + \beta \sin n\theta \right], \tag{3} \]

where $D = \sqrt{4AC - B^2}$ and $\theta = \cos^{-1}(-B/2\sqrt{AC})$. When $A < 0$, (3) may be
used mutatis mutandis: specifically, multiply the right-hand side of (3) by $-1$ after
changing $\theta$ to $\cos^{-1}(B/2\sqrt{AC})$. When $A > 0$, a sometimes more convenient
alternative to (3) is

\[ \lambda_n = \frac{2R}{D} \left( \frac{A}{C} \right)^{n/2} \sin(n\theta + \psi), \tag{3*} \]

where $R = \sqrt{A\gamma^2 - B\beta\gamma + C\beta^2}/\sqrt{C}$ and $\psi = \operatorname{sgn}(\gamma)\cos^{-1}((\beta - \gamma B/2C)/R)$
under the convention $\operatorname{sgn}(x) = 1$ if $x > 0$, and $= -1$ otherwise. For example,
when $p = 3/4$, the formula $1 - 2(3/4)^{n+1} + 2(\sqrt3/4)^{n+1} \sin((n+1)\theta - \pi/2)$,
$\theta = \cos^{-1}(-1/\sqrt3)$, for the probability of three heads in a row during $n$ tosses of
the coin comes a little more directly from the generating function $p^3s^3/((1 - s)
(1 - qs - pqs^2 - p^2qs^3))$ if one uses (3*).
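The three cases can be cross-checked against the coefficients produced by the obvious linear recurrence $C\lambda_n + B\lambda_{n-1} + A\lambda_{n-2} = 0$ for $n \ge 2$ (with $C\lambda_0 = \gamma$ and $C\lambda_1 + B\lambda_0 = \beta$). The sketch below is one way to do this; the closed forms are written as they come out of the partial-fraction expansion, so they should be read as a working transcription of (1), (2), and (3), and the function names are hypothetical.

```python
import math


def series_coeffs(beta, gamma, A, B, C, n_terms):
    """Maclaurin coefficients of (beta*x + gamma)/(A*x^2 + B*x + C) by recurrence."""
    lam = []
    for n in range(n_terms):
        rhs = gamma if n == 0 else (beta if n == 1 else 0.0)
        if n >= 1:
            rhs -= B * lam[n - 1]
        if n >= 2:
            rhs -= A * lam[n - 2]
        lam.append(rhs / C)
    return lam


def trio(beta, gamma, A, B, C, n):
    """Closed form for the n-th coefficient, split by the sign of B^2 - 4AC."""
    disc = B * B - 4 * A * C
    if disc > 0:                                   # case (i)
        D = math.sqrt(disc)
        u, v = (B + D) / (2 * C), (B - D) / (2 * C)
        return (-1) ** n / D * ((gamma * u - beta) * u ** n
                                - (gamma * v - beta) * v ** n)
    if disc == 0:                                  # case (ii); B is nonzero here
        w = B / (2 * C)
        return (-1) ** n / C * w ** (n - 1) * ((n + 1) * gamma * w - n * beta)
    D = math.sqrt(-disc)                           # case (iii)
    sign = 1.0 if A > 0 else -1.0                  # A < 0: flip theta and the overall sign
    theta = math.acos(-sign * B / (2 * math.sqrt(A * C)))
    bracket = (gamma * math.sqrt(A / C) * math.sin((n + 1) * theta)
               + beta * math.sin(n * theta))
    return sign * 2 / D * (A / C) ** (n / 2) * bracket
```

With `beta=1, gamma=0, A=-1, B=-1, C=1` the values `trio(1, 0, -1, -1, 1, n)` are the Fibonacci numbers up to rounding error.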

In heavy-duty applications, these formulas can be considerably less tedious than
the standard technique. For example, the expression

\[ p_n = \sum_i \binom{n-2i-2}{2i} p^{2i+2} q^{n-2i-2} \]

represents the probability that, in $n$ tosses of the coin, the number of heads is even
and that the first pair of consecutive heads appears on the last two tosses. In
this case with $p_0 = p_1 = 0$, we have

\[ P(s) = \sum_{n=0}^{\infty} p_n s^n = \frac{p^2s^2(1 - qs)}{(1 - qs)^2 - (pqs^2)^2}. \]

A complete partial fractions decomposition of $P(s)$ would be forbidding, but it is
quite easy to express $P(s) = p(P_1(s) + P_2(s))/2q$, where $P_1(s) = (qs - 1)/(pqs^2
+ qs - 1)$ and $P_2(s) = (qs - 1)/(pqs^2 - qs + 1)$. The quadratic trio now provides
$\Sigma$-free formulas for the $p_n$ with relatively little additional algebra to achieve
simplest form. Only (1) is required for the case $p < 1/5$, (1) and (2) for $p = 1/5$,
and (1) and (3) for $p > 1/5$. It is unexpected to find $p = 1/5$ as a changeover
point between substantially different formulas for the $p_n$. When $p = 1/2$, we have
$\theta = \pi/3$, which causes the sequence $\{\gamma\sqrt{A/C}\,\sin(n+1)\theta + \beta \sin n\theta\} =
(1/2)\{\sin((n-1)\pi/3)\}$ to “deconstruct” into the period-six sequence
$(\sqrt3/4)\{-1, 0, 1, 1, 0, -1, -1, 0, \ldots\}$. The overall result is

\[ 2^n p_n = \frac{1}{2^n\sqrt5} \left( (\sqrt5+1)^{n-1} + (-1)^n (\sqrt5-1)^{n-1} \right) + \frac12\,\epsilon_n, \]

where $\epsilon_n = -1$ if $n \equiv 0, 5 \pmod 6$; $\epsilon_n = 0$ if $n \equiv 1, 4 \pmod 6$; and $\epsilon_n = 1$ if $n \equiv
2, 3 \pmod 6$. It is important to emphasize how easily and quickly this result is
obtained from $P_1(s)$, $P_2(s)$, and the quadratic trio.
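For a fair coin, $2^n p_n$ is simply the number of head/tail strings of length $n$ with an even number of heads whose first pair of consecutive heads falls on the last two tosses, so the closed form can be tested by enumeration. A minimal sketch with illustrative function names:

```python
import math
from itertools import product


def count_sequences(n):
    """Strings of length n over {H, T} with an even number of heads whose
    first occurrence of 'HH' is at the last two positions."""
    total = 0
    for w in product("HT", repeat=n):
        s = "".join(w)
        if n >= 2 and s.count("H") % 2 == 0 and s.find("HH") == n - 2:
            total += 1
    return total


def formula(n):
    """Closed form for 2^n * p_n when p = 1/2, with the period-six term eps_n."""
    eps = {0: -1, 5: -1, 1: 0, 4: 0, 2: 1, 3: 1}[n % 6]
    r5 = math.sqrt(5)
    main = ((r5 + 1) ** (n - 1) + (-1) ** n * (r5 - 1) ** (n - 1)) / (r5 * 2 ** n)
    return main + eps / 2
```

For $n = 4$ the only admissible string is TTHH, and `formula(4)` evaluates to 1 up to rounding.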
In a completely different application of the quadratic trio, we obtain explicit
formulas for the $n$th derivative of

\[ f(x) = \frac{1 + x}{1 + x + x^2} = \frac{(x-a) + (a+1)}{(x-a)^2 + (2a+1)(x-a) + (a^2+a+1)}, \]

where $a$ is any constant. The drill is: Obtain the coefficient $\lambda_n$ of $(x-a)^n$ in the
Taylor series of $f$ about $a$ from, in this case, (3), change $a$ into $x$, set
$g(x) = \cos^{-1}(-(2x+1)/2\sqrt{x^2+x+1})$, and multiply by $n!$ to get

\[ f^{(n)}(x) = (2/\sqrt3)\, n!\, (x^2+x+1)^{-n/2} \left[ \frac{(x+1)\sin((n+1)g(x))}{\sqrt{x^2+x+1}} + \sin(ng(x)) \right]. \]

With (3*) in place of (3), we would get

\[ f^{(n)}(x) = (2/\sqrt3)\, n!\, (x^2+x+1)^{-(n+1)/2} \sin(ng(x) + h(x)), \]

where $h(x) = \operatorname{sgn}(1+x)\cos^{-1}((1-x)/2\sqrt{x^2+x+1})$.


One may envisage a “cubic quartet” of formulas for the coefficients in the
Maclaurin series of the ratio of a quadratic and a cubic. The quadratic trio would
facilitate the derivation of such formulas.


Department of Mathematics 
Monash University 

Clayton, Victoria 3168 
Australia 


A Discrete Form of the 
Beckman—Quarles Theorem 


Apoloniusz Tyszka 


The following theorem may be viewed (for $n = 2$) as a discrete form of the
classical Beckman–Quarles theorem, which states that any map from $\mathbb{R}^n$ to $\mathbb{R}^n$
($2 \le n < \infty$) preserving unit distances is an isometry; see [1].


Theorem. If $x, y \in \mathbb{R}^2$ and $|x - y|$ is constructible by means of ruler and compass,
then there exists a finite set $S_{xy} \subseteq \mathbb{R}^2$ containing $x$ and $y$ such that each map from $S_{xy}$
to $\mathbb{R}^2$ preserving all unit distances preserves the distance between $x$ and $y$.


Proof: It is known that a segment can be constructed with the use of a ruler and a
compass if and only if its length belongs to the real quadratic closure of the field of
rational numbers [5]. Let us denote by $D$ the set of all non-negative numbers $d$
with the following property:


If $x, y \in \mathbb{R}^2$ and $|x - y| = d$ then there exists a finite set $S_{xy}$ such that
$x, y \in S_{xy}$ and any map $f: S_{xy} \to \mathbb{R}^2$ that preserves unit distance also
preserves the distance between $x$ and $y$.


Obviously $0, 1 \in D$. We first prove that if $d \in D$ then $\sqrt3 \cdot d \in D$. Let us assume
that $d > 0$, $x, y \in \mathbb{R}^2$, $|x - y| = \sqrt3 \cdot d$. Using the notation of Figure 1, we show
that

\[ S_{xy} = \bigcup \{ S_{ab} : a, b \in \{x, y, x_1, x_2, \tilde{y}, \tilde{x}_1, \tilde{x}_2\},\ |a - b| = d \} \]

is adequate for the segment $xy$.


Figure 1


Let us assume that $f: S_{xy} \to \mathbb{R}^2$ preserves the distance 1. Since

\[ S_{xy} \supseteq S_{y\tilde{y}} \cup S_{xx_1} \cup S_{xx_2} \cup S_{yx_1} \cup S_{yx_2} \cup S_{x_1x_2}, \]

we conclude that $f$ preserves the distances between $y$ and $\tilde{y}$, $x$ and $x_1$, $x$ and $x_2$,
$y$ and $x_1$, $y$ and $x_2$, and $x_1$ and $x_2$. Hence $|f(y) - f(\tilde{y})| = d$ and $|f(x) - f(y)|$ is
either 0 or $\sqrt3 \cdot d$. Analogously we have that $|f(x) - f(\tilde{y})|$ is either 0 or $\sqrt3 \cdot d$.
Thus $f(x) \ne f(y)$, so $|f(x) - f(y)| = \sqrt3 \cdot d$, which completes the proof that
$\sqrt3 \cdot d \in D$.

If $d \in D$, then $2 \cdot d \in D$ (see Figure 2).


Figure 2. $|x - y| = 2 \cdot d$; $S_{xy}$ is the union of the sets $S_{ab}$ over pairs $a, b$ of the five labelled points of the figure (among them $x$, $y$, and $z$) with $|a - b| = d$ or $|a - b| = \sqrt3 \cdot d$.


From Figure 3 it is clear that if $d \in D$ then all distances $k \cdot d$ ($k$ a positive
integer) belong to $D$.


Figure 3. $|x - y| = k \cdot d$; $S_{xy} = \bigcup \{ S_{ab} : a, b \in \{x, y, x_1, \ldots, x_{k-1}\},\ |a - b| = d \lor |a - b| = 2 \cdot d \}$.


From Figure 4 it is clear that if $d \in D$, then all distances $d/k$ ($k$ a positive
integer) belong to $D$. Hence $D \supseteq \mathbb{Q}^+$; here and subsequently $\mathbb{Q}^+$ denotes the set
of positive rational numbers.


Figure 4. $|x - y| = d/k$; two segments of the figure have length $(k-1) \cdot d$, and $S_{xy}$ is a union of seven sets $S_{ab}$.


If $a, b \in D$, $a > b$, then $\sqrt{a^2 - b^2} \in D$ (see Figure 5). Hence

\[ \sqrt2 \cdot a = \sqrt{(\sqrt3 \cdot a)^2 - a^2} \in D \quad\text{and}\quad \sqrt{a^2 + b^2} = \sqrt{(\sqrt2 \cdot a)^2 - \left(\sqrt{a^2 - b^2}\right)^2} \in D. \]


Figure 5. $|x - y| = \sqrt{a^2 - b^2}$; $S_{xy}$ is a union of five sets $S_{ab}$.


The construction presented in Figure 6 shows that if $a, b \in D$, $a > b$, then
$a - b \in D$, hence $a + b = 2 \cdot a - (a - b) \in D$.


Figure 6. $|x - y| = a - b$, $2 \cdot c < a + b$; $S_{xy}$ is a union of seven sets $S_{ab}$.


In order to prove that $D \setminus \{0\}$ is a multiplicative group it remains to observe
that if positive $a, b, c \in D$ then $ab/c \in D$ (see Figure 7).


Figure 7. $b < 2 \cdot n \cdot c$; the set for this figure is again a union of seven sets $S_{ab}$.


If $a \in D$, $a > 1$, then $\sqrt{a} = \frac12 \cdot \sqrt{(a+1)^2 - (a-1)^2} \in D$; if $a \in D$, $0 < a <
1$, then $\sqrt{a} = 1/\sqrt{1/a} \in D$. Thus $D$ contains all non-negative real numbers
contained in the quadratic closure of $\mathbb{Q}$.
Remarks. One can easily see that if a distance $d$ belongs to the set $D$ defined in
the first paragraph of our proof, then $d$ is definable in $\mathbb{R}$ by some formula in the
language of fields; see [2, pp. 35, 41] for formal definitions of these terms. From
this, we can conclude that any such $d$ must be an algebraic number [2, p. 197]. On
the other hand, applying results from [7] and [8] (and from [9] for $n > 2$), we can
prove that any algebraic distance belongs to $D$.

In our proof we used some ideas of [6], and some ideas are based on constructions
of Georg Mohr, who first proved that all Euclidean constructions can be
carried out with compass alone; see [4] and [3]. Using these ideas we recently
extended the theorem to constructible differences in $\mathbb{R}^n$ ($2 < n < \infty$) and multivalued
mappings.
PROBLEMS AND SOLUTIONS


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 


with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, Richard Pfiefer, Leonard Smiley, John Henry Steelman, 
Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY 
problems address on the inside front cover. Submitted problems should include 
solutions and relevant references. Submitted solutions should arrive at that address 
before March 31, 1998. Additional information, such as generalizations and references,
is welcome. The problem number and the solver’s name and address should
appear on each solution. An acknowledgement will be sent only if a mailing label 
is provided. An asterisk (*) after the number of a problem or a part of a problem 
indicates that no solution is currently available. 


PROBLEMS 


10613. Proposed by F. J. Flanigan, San Jose State University, San Jose, CA. Fix a positive
real number $\nu$. Find all polynomials $P(x)$ with nonnegative real coefficients such that

(a) $P(0) = 0$, $P(1) = 1$, and $P(x) \le x^{\nu}$ for all $x > 0$.

(b) $P(0) = 0$, $P(1) = 1$, and $P(x) \ge x^{\nu}$ for all $x > 0$.


10614. Proposed by Grigore-Raul Tataru, University of Bucharest, Romania. Fix $p \ge 1$.
Suppose that $a_1, a_2, \ldots$ is a sequence of positive real numbers such that $a_n a_{n+1} a_{n+2}^{p} +
a_{n+2} - a_n = 0$ for all $n \ge 1$. Show that $\{a_n\}$ is convergent.


10615. Proposed by Joaquín Gómez Rey, Alcorcón, Madrid, Spain. For n a positive integer,


evaluate 
ni —Dki 


Likithe te +ko'T | ape 


where the summation runs over all $n$-tuples $(k_1, k_2, \ldots, k_n)$ of nonnegative integers such
that $k_1 + 2k_2 + \cdots + nk_n = n$.


10616. Proposed by Ernesto Bruno Cossi, Porto Alegre, Brazil. Let $K$ be a compact, convex
set in the plane. For each interior point $P$ of $K$ and each line $l$ through $P$, let $A$ and $B$ be the
two points of $l$ on the boundary of $K$, and let $Q$ be the harmonic conjugate of $P$ with respect
to $A$ and $B$. (That is, take $Q$ to be collinear with $A$, $P$, and $B$ so that $AP/PB = QA/QB$.)
If $K$ is an ellipse, then for each $P$ the locus of points $Q$ is a straight line. Is the converse true?


10617. Proposed by James G. Merickel, Philadelphia, PA. For a positive integer $N$, $\sigma(N)$
denotes the sum of the positive divisors of $N$. Given a positive integer $n$ and a prime $p$,
prove that there exist arbitrarily large sets $S$ of multiples of $n$ with the following property:
For some positive integer $m$, the fraction $\sigma(N)/N$ reduces to a fraction whose denominator
is $p^m$ for every $N \in S$.
10618. Proposed by S. Lakshminarayanan, S. L. Shah, and K. Nandakumar, University of
Alberta, Edmonton, Canada. Let $A$ be a real $m \times n$ matrix of full rank with $m \le n$ and let
$b$ be a real $m \times 1$ matrix. For $1 \le i \le n$, define

\[ x_i = \frac{\det(A_i^{*} A^T) - \det(\hat{A}_i \hat{A}_i^T)}{\det(AA^T)}, \]

where $A_i^{*}$ is obtained by replacing the $i$th column of $A$ by $b$ and $\hat{A}_i$ is obtained by deleting
the $i$th column of $A$. Show that $x = [x_1, \ldots, x_n]^T$ is a solution to the linear system $Ax = b$.


10619.* Proposed by John Lawrence, Virginia Polytechnic Institute and State University,
Blacksburg, VA. Let $X$ and $Y$ be independent and identically distributed random variables in
$\mathbb{R}^n$ with density $f(x)$ given by $g(\|x\|)$, where $\|x\|$ is the Euclidean norm and $g$ is a strictly
decreasing function (i.e., $f$ is spherically symmetric about the origin and unimodal). For
fixed $z \in \mathbb{R}^n$, let $h(z)$ denote the probability that the side joining $X$ and $Y$ is the longest side
in a random triangle with vertices $X$, $Y$, and $z$. Must $h$ be spherically symmetric about the
origin and unimodal?


SOLUTIONS 


Incircles of Curvilinear Triangles 


10368 [1994, 273]. Proposed by Emre Alkan, Bosphorus University, Istanbul, Turkey. For
each point $O$ on diameter $AB$ of a circle, perform the following construction. Let the
perpendicular to $AB$ at $O$ meet the circle at point $P$. Inscribe circles in the figures bounded
by the circle and the lines $AB$ and $OP$. Let $R$ and $S$ be the points at which the two
incircles to the curvilinear triangles $AOP$ and $BOP$ are tangent to the diameter $AB$. Show
that $\angle RPS$ is independent of the position of $O$.


Solution I by A. N. ’t Woord, University of Technology, Eindhoven, The Netherlands. We
prove that the line $PR$ bisects $\angle APO$. Because the figure is symmetric, this implies that $PS$
bisects $\angle OPB$. Hence $\angle RPS = \pi/4$.

A well known theorem of geometry says $OP^2 = AO \cdot OB$, where the expression $XY$
denotes the length of the segment with endpoints $X$ and $Y$. By adding the square of $OB$ we
obtain

\[ BP^2 = OP^2 + OB^2 = AO \cdot OB + OB \cdot OB = AB \cdot OB. \tag{1} \]

Let $M$ be the center of the circle with diameter $AB$ and $F$ be the center of the circle
touching $AB$, $OP$ and the arc $AP$. Then $MF = AM - OR$. Square both sides to see that

\[ RM^2 + OR^2 = MF^2 = (AM - OR)^2 = AM^2 - 2AM \cdot OR + OR^2. \]

Hence

\[ AB \cdot BR - BR^2 = AR \cdot BR = (AM - RM)(AM + RM) = AM^2 - RM^2 = 2AM \cdot OR = AB \cdot OR, \]

and after rearrangement this yields

\[ BR^2 = AB(BR - OR) = AB \cdot OB. \tag{2} \]

Combining (1) and (2) gives $BR = BP$, so that $\angle PRB = \angle RPB$. This implies that
$\angle RPO = \pi/2 - \angle PRB = \pi/2 - \angle RPB = \angle APR$.
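The constancy of the angle is easy to test in coordinates. The sketch below is an independent check rather than part of the solution: it assumes the circle is the unit circle with $A = (-1, 0)$, $B = (1, 0)$, $O = (t, 0)$, and solves the tangency condition for the inradius on each side (a circle of radius $\rho$ centered at $(t \mp \rho, \rho)$ must be at distance $1 - \rho$ from the origin). The function name is illustrative.

```python
import math


def angle_RPS(t):
    """Angle RPS for O = (t, 0) on the diameter of the unit circle, -1 < t < 1."""
    h = math.sqrt(1 - t * t)                        # OP, the height of P above AB
    rho_left = -(1 - t) + math.sqrt(2 * (1 - t))    # inradius on the A side, equal to OR
    rho_right = -(1 + t) + math.sqrt(2 * (1 + t))   # inradius on the B side, equal to OS
    # The right angle at O gives tan(RPO) = OR / OP and tan(OPS) = OS / OP.
    return math.atan(rho_left / h) + math.atan(rho_right / h)
```

For $t = 0.5$ the two summands are $30^\circ$ and $15^\circ$, and for every admissible $t$ the total is $\pi/4$ up to rounding.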


Solution II by O. P. Lossers, University of Technology, Eindhoven, The Netherlands. The
desired angle equals 45°, as we show in a slightly more general situation:

Proposition 1. Extend both $AB$ and $PO$ to infinite lines. Then the given circle $\Gamma$, the line
$AB$, and the line $PO$ split the plane into eight curvilinear (in some cases infinite) triangles.
Denote the points of contact of the “incircles” (circles that touch all three boundary lines) of
these curvilinear triangles with the line $AB$ by $Q$, $R$, $S$, and $T$, in that order with $A$ between
$Q$ and $R$. Then $\angle QPR = \angle RPS = \angle SPT = 45°$, $PR$ bisects $\angle APO$, and $PS$ bisects
$\angle OPB$.


Proof. Let $O'$ denote the second point of intersection of $PO$ and $\Gamma$. Let $\Lambda$ denote the circle
that has $PO'$ as its diameter. The inversion with center $P$ that maps $O$ to $O'$ transforms the
configuration $\{\Gamma, AB, OO'\}$ into the configuration $\{l, \Lambda, O'O\}$, where $l$ is the diameter
of $\Lambda$ that is parallel to the tangent to $\Gamma$ at $P$. This latter configuration has two perpendicular
symmetry axes through $O$, and the incircles of the corresponding partitions have tangent
points with $\Lambda$ that halve the corresponding arcs of $\Lambda$. Since inversion in $P$ preserves angles
and takes lines through $P$ to themselves, the results follow. □


This allows the following generalization.


Proposition 2. Consider two circles meeting at a point $P$. Suppose that the common
diameter of these circles meets one circle at point $A$ and the other at $O$, where the segment
$AO$ meets the circles only at its endpoints. Inscribe a circle in the triangle whose sides are
the line segment $AO$ and the circular arcs $AP$ and $OP$. Let the point where the inscribed
circle touches $AO$ be denoted by $R$. Then $PR$ is the angle bisector of $\angle APO$.


Proof. Invert the configuration $\{l, \Lambda, O'O\}$ of Proposition 1 about a point $P$ of $\Lambda$ not on
$O'O$. □


Editorial comment. Many solvers used methods based on trigonometry or coordinates, and 
several included figures. The selected solutions are among the more purely geometric, and 
the description of the construction allows the reader to construct a figure easily. 

Francisco Bellot Rosado included references to several other related problems, including 
MONTHLY problem 3887 [1938, 482; 1983, 486]. 


Solved also by P. M. Abe, M. Amengual (Spain), J. Anglesio (France), R. Barbara (Lebanon), F. Bellot Rosado (Spain), 
K. L. Bernstein, R. J. Chapman (U. K.), A. Coffman, R. G. Griswold, J.-P. Grivaux (France), R. Holzsager, D. J. Jones, 
H. Kappus (Switzerland), H. G. Killingbergtrø (Norway), P. G. Kirmser, N. Komanda, M. Lehtinen (Finland), H. M.
Marston, A. Nijenhuis, C. G. Petalas (Greece), M. Reid, I. A. Sakmar (Turkey), J. Sarkar, B. Shawyer (Canada), D. Tang, 
J. Vlachos (Greece), N. D Vo (Canada), M. Vowe (Switzerland), H. Weingarten, R. L. Young, NSA Problems Group, 
and the proposer. 


A Hidden Equilateral Triangle 


10378 [1994, 363]. Proposed by Bjorn Poonen, University of California, Berkeley, CA.
Given that point $D$ is in the interior of $\triangle ABC$ and that there are real numbers $a$, $b$, $c$, $d$
such that $AB = ab$, $AC = ac$, $AD = ad$, $BC = bc$, $BD = bd$, and $CD = cd$, prove that
$\angle ABD + \angle ACD = \pi/3$.


Solution I by Jean Anglesio, Garches, France. We have

\[ \frac{DB}{DC} = \frac{AB}{AC} = \frac{b}{c}, \qquad \frac{DC}{DA} = \frac{BC}{BA} = \frac{c}{a}, \qquad \frac{DA}{DB} = \frac{CA}{CB} = \frac{a}{b}. \]

Then $D$ is one “isodynamic point” of the triangle $ABC$, that is, a point common to the three
“Apollonius circles” of $\triangle ABC$ (see R. A. Johnson, Modern Geometry, Houghton Mifflin,
1929). We recall that one Apollonius circle of $\triangle ABC$ is the locus of points $M$ such that
$MB/MC = AB/AC$. The three Apollonius circles of $\triangle ABC$ have two points in common,
called the isodynamic points of $\triangle ABC$. One of the two points is exterior to the circumcircle
of $\triangle ABC$. Furthermore, if the other isodynamic point $D$ is in the interior of $\triangle ABC$, then
$\angle BDC = \angle BAC + \pi/3$, $\angle CDA = \angle CBA + \pi/3$, and $\angle ADB = \angle ACB + \pi/3$.
Now if $E$ is the intersection of $BD$ and $AC$, we have $\angle BDC = \angle DCA + \angle CED$ and
$\angle CED = \angle BAC + \angle DBA$. Combining these, we obtain

\[ \angle BAC + \pi/3 = \angle DCA + \angle BAC + \angle DBA, \]

from which the result follows.


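Solution II by O. P. Lossers, University of Technology, Eindhoven, The Netherlands.
Embedding the figure in the complex plane, so that the points $A$, $B$, $C$ and $D$ correspond
to the complex numbers $\alpha$, $\beta$, $\gamma$ and $\delta$, respectively, we observe that the three numbers
$(\delta - \alpha)(\beta - \gamma)$, $(\delta - \beta)(\gamma - \alpha)$ and $(\delta - \gamma)(\alpha - \beta)$ all have the same modulus $abcd$. On the
other hand, these three numbers add up to zero! Thus, they are the vertices of an equilateral
triangle centered at the origin. In other words, we have

\[ \frac{(\delta - \alpha)(\beta - \gamma)}{(\delta - \beta)(\gamma - \alpha)} = \frac{(\delta - \beta)(\gamma - \alpha)}{(\delta - \gamma)(\alpha - \beta)} = \frac{(\delta - \gamma)(\alpha - \beta)}{(\delta - \alpha)(\beta - \gamma)} = \exp\left( \pm\frac{2\pi i}{3} \right). \]

Hence,

\[ \frac{\delta - \alpha}{\delta - \beta} = -\exp\left( \pm\frac{2\pi i}{3} \right) \frac{\gamma - \alpha}{\gamma - \beta} = \exp\left( \mp\frac{\pi i}{3} \right) \frac{\gamma - \alpha}{\gamma - \beta}. \]

Taking arguments, we get $\angle ADB = \angle ACB \pm \pi/3$. Since $\angle ADB > \angle ACB$ and the angle
sum in quadrangle $ADBC$ equals $2\pi$, the result follows.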


Solution III by C. Kenneth Fan, Harvard University, Cambridge, MA. Construct three
triangles $\triangle B_\alpha C_\alpha D_\alpha$, $\triangle A_\beta C_\beta D_\beta$, and $\triangle A_\gamma B_\gamma D_\gamma$, respectively similar to the three triangles
$\triangle BCD$, $\triangle ACD$, and $\triangle ABD$, with similarity ratios $a$, $b$, and $c$ (so that each of the new
triangles has a side of length $abc$). Label each subscripted vertex to correspond to its
unsubscripted counterpart.

By construction, $\angle B_\alpha D_\alpha C_\alpha + \angle A_\beta D_\beta C_\beta + \angle A_\gamma D_\gamma B_\gamma = 2\pi$. Furthermore, the lengths
work out so that we may fit the constructed triangles together so that their interiors do
not overlap and so that the edges $B_\alpha D_\alpha$ and $A_\beta D_\beta$ coincide, the edges $C_\beta D_\beta$ and $B_\gamma D_\gamma$
coincide, and the edges $A_\gamma D_\gamma$ and $C_\alpha D_\alpha$ coincide. We also require that the vertices $D_\alpha$,
$D_\beta$, and $D_\gamma$ coincide. The result is an equilateral triangle of side length $abc$. Thus,

\[ \angle ABD + \angle ACD = \angle A_\gamma B_\gamma D_\gamma + \angle A_\beta C_\beta D_\beta = \pi/3. \]


Solution IV by Hans Georg Killingbergtrø, Horten, Norway. From the figure $ABCD$, make
a similar figure $ACC'D'$, employing a scale of $c : b$ and $\angle CAB$ as angle of rotation about
$A$, whereby $CD' = CD$ and $\angle DCD' = \angle ABD + \angle ACD$. Since $\angle D'AD = \angle CAB$ (the
angle of rotation) and $AD'/AD = c/b = AC/AB$, $\triangle ADD'$ is similar to $\triangle ABC$ with
a scale of $d : b$. Hence $DD' = BC \cdot d/b = cd = CD$, which shows that $\triangle CDD'$ is
equilateral.


Editorial comment. The Apollonius circles also appear in H. S. M. Coxeter, Introduction 
to Geometry, Second edition, Wiley, 1969, section 6.6, p. 88, and D. Pedoe, A Course of 
Geometry, Cambridge, 1970, section 18.3, p. 77. In both of these references, this concept 
is used to introduce inversive geometry. In this spirit, one notices that solution II interprets 
the conclusion as a complex cross ratio, which is the fundamental invariant of inversive 
geometry. Indeed, inversive geometry leads to yet another solution, which we sketch here. 
First, the concept of triangle is not preserved by inversion, so it needs to be replaced. The 
line AB can be considered as the circle through A, B and the point at infinity. Thus, the 
image of the triangle under an inversive transformation can be recovered from the images 
of four points: A, B, C and the point at infinity. Meanwhile, each Apollonius circle is 
characterized as the points P for which a certain cross ratio of P with the points A, B, and 
C is purely imaginary. Thus, the isodynamic points depend only on the locations of A, B, 
C and not on the point playing the role of the point at infinity. In particular, if A, B, and C 
are the vertices of an equilateral triangle, the two isodynamic points must be the center of
the triangle and the point at infinity, and the result is clear. The general result follows since 
there is an inversive transformation taking any three points to the vertices of an equilateral 
triangle. 

Another approach used Euler’s formula for the volume of a tetrahedron (see Heinrich
Dörrie, 100 Great Problems of Elementary Mathematics, Dover, 1965, section 68) to express
the condition that A, B, C, and D are coplanar, and the law of cosines to express the desired 
conclusion in terms of a, b, c, and d. It is then straightforward to relate the two conditions. 
Solved also by R. Barbara (Lebanon), J. C. Binz (Switzerland), R. J. Chapman (U. K.), A. Coffman, S. B. Ekhad, M. S. 


Klamkin (Canada), J. H. Lindsey II, M. Reid, N. D Vo (Canada), M. Vowe (Switzerland), NSA Problems Group, and 
the proposer. 


Some Volumes in Infinite Products of an Interval 


10402 [1994, 682]. Proposed by Werner Schindler, Universität Regensburg, Regensburg, 
Germany. For every $j \in \mathbb{N}$ the term $\lambda_j$ denotes a copy of the Lebesgue measure on the 
unit interval $[0, 1]$ while $\lambda^{\mathbb{N}}$ stands for the infinite product measure $\bigotimes_{j \in \mathbb{N}} \lambda_j$ on $[0, 1]^{\mathbb{N}}$. The 
mapping $f : [0, 1] \to [0, 1]$ is assumed to be measurable. Its $n$-fold application $f \circ f \circ \cdots \circ f$ 
is abbreviated as $f^{[n]}$ and $A(f) = \{ (x_n) \in [0, 1]^{\mathbb{N}} : x_n \le f^{[n]}(x_0) \text{ for each } n \ge 1 \}$. 

(a) Express $\lambda^{\mathbb{N}}(A(f))$ as a limit of one-dimensional definite integrals. 

(b) Find a function $f$ with $\lambda^{\mathbb{N}}(A(f)) = 1/2$. 

(c) Compute $\lambda^{\mathbb{N}}(A(f))$ for $f(x) = e^{x-1}$. 


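Solution by A. N. ’t Woord, University of Technology, Eindhoven, The Netherlands. (a) For 
$m \ge 1$, let 
$$A_m(f) = \{ (x_n) \in [0, 1]^{\mathbb{N}} : x_n \le f^{[n]}(x_0) \text{ for } n \le m \}.$$
Then 
$$\lambda^{\mathbb{N}}(A_m(f)) = \int_0^1 f(x)\, f^{[2]}(x) \cdots f^{[m]}(x)\, dx.$$
Since $A(f) = \bigcap_{m=1}^{\infty} A_m(f)$, we get 
$$\lambda^{\mathbb{N}}(A(f)) = \lim_{m \to \infty} \lambda^{\mathbb{N}}(A_m(f)) = \lim_{m \to \infty} \int_0^1 f(x)\, f^{[2]}(x) \cdots f^{[m]}(x)\, dx.$$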


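(b) Let $f(x) = \sqrt{x}$. Then $f(x)\, f^{[2]}(x) \cdots f^{[m]}(x) = x^{1 - 2^{-m}}$, so 
$$\lambda^{\mathbb{N}}(A(f)) = \lim_{m \to \infty} \int_0^1 x^{1 - 2^{-m}}\, dx = \lim_{m \to \infty} \frac{1}{2 - 2^{-m}} = \frac{1}{2}.$$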


(c) For $f(x) = e^{x-1}$ we have 
$$\frac{d f^{[n]}}{dx}(x) = f^{[n]}(x)\, \frac{d f^{[n-1]}}{dx}(x),$$
and so by induction 
$$\frac{d f^{[n]}}{dx}(x) = f(x)\, f^{[2]}(x) \cdots f^{[n]}(x).$$
Since $e^x \ge 1 + x$ with equality only for $x = 0$, $f(x) \ge x$ with 
equality only for $x = 1$. Thus for $x \in [0, 1)$ the sequence $(f^{[n]}(x))$ is increasing and 
bounded, so it has a limit $L$. Continuity of $f$ implies that $L$ is a fixed point, so $L = 1$. 
Therefore, 
$$\lambda^{\mathbb{N}}(A(f)) = \lim_{m \to \infty} \int_0^1 \frac{d f^{[m]}}{dx}(x)\, dx = \lim_{m \to \infty} \bigl( f^{[m]}(1) - f^{[m]}(0) \bigr) = 1 - 1 = 0.$$


Editorial comment. John H. Lindsey II gave a general form of the example of (b), using the 
same method to show that, for $f(x) = x^t$ with $0 < t < 1$, $\lambda^{\mathbb{N}}(A(f)) = 1 - t$. 
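
The limit formula of part (a) can be checked numerically. The following is only a rough sketch, not part of the published solution: the function names, the midpoint rule, and the grid size are assumptions made for illustration. It approximates the integral of $f(x) f^{[2]}(x) \cdots f^{[m]}(x)$ over $[0, 1]$ for a fixed $m$.

```python
import math

def partial_measure(f, m, grid=5000):
    """Midpoint-rule estimate of the integral over [0, 1] of
    f(x) * f^[2](x) * ... * f^[m](x) (hypothetical helper)."""
    h = 1.0 / grid
    total = 0.0
    for j in range(grid):
        x = (j + 0.5) * h
        y, prod = x, 1.0
        for _ in range(m):
            y = f(y)        # y runs through f(x), f^[2](x), ..., f^[m](x)
            prod *= y
        total += prod * h
    return total

def power(t):
    """The map x -> x**t from the editorial comment."""
    return lambda x: x ** t

def shifted_exp(x):
    """The map x -> exp(x - 1) from part (c)."""
    return math.exp(x - 1.0)

if __name__ == "__main__":
    print(partial_measure(power(0.5), 40))     # should be close to 1/2
    print(partial_measure(power(0.25), 40))    # should be close to 1 - 1/4
    print(partial_measure(shifted_exp, 200))   # small, and it shrinks as m grows
```

For part (c) the convergence to 0 is slow, so the value for a finite $m$ is small but visibly positive.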


Solved also by €. Anderson, N. Bouzar, R. J. Chapman (U. K.), W. Hensgen (Germany), V. Hernandez (Spain), 
R. Holzsager, J. H. Lindsey II, R. Stong, WMC Problems Group, and the proposer. 




Relatively Prime Sines and Cosines 


10409* [1994, 794]. Proposed by J. van de Lune, Centrum voor Wiskunde en Informatica, 
Amsterdam, The Netherlands. Let $p_1, p_2, p_3, \ldots$ denote the prime numbers in increasing 
order, and define $S_k = \{ t \in \mathbb{R} : \sin(t \log p_k) > 0 \}$, $C_k = \{ t \in \mathbb{R} : \cos(t \log p_k) > 0 \}$, 
$S_n^* = \bigcap_{k=1}^{n} S_k$, and $C_n^* = \bigcap_{k=1}^{n} C_k$. Prove (or disprove) that the relative measure of $S_n^*$ and 
$C_n^*$ (in $\mathbb{R}$) is equal to $2^{-n}$. More precisely, prove (or disprove) that 
$$\lim_{T \to \infty} \frac{1}{2T}\, \lambda(S_n^* \cap [-T, T]) = 2^{-n}$$
and the corresponding statement for $C_n^*$, where $\lambda$ denotes Lebesgue measure. 


Solution by National Security Agency Problems Group, Fort Meade, MD. First, we fix some 
notation and introduce a basic definition. For a subset $J$ of $[0, 1]^n$, $c_J$ denotes the indicator 
function of $J$; that is, $c_J(x) = 1$ or $0$ depending on whether or not $x$ is in $J$. 

Let $\omega$ map $\mathbb{R}$ into $[0, 1]^n$. The function values $\omega(t)$ are said to be C-uniformly distributed 
modulo 1 if for every parallelepiped $J \subset [0, 1]^n$ we have 
$$\lim_{T \to \infty} \frac{1}{T} \int_0^T c_J(\omega(t))\, dt = \int_{[0, 1]^n} c_J(x)\, dx. \tag{1}$$
A theorem of Kronecker says: If $\alpha_1, \alpha_2, \ldots, \alpha_n$ are real numbers that are linearly independent 
over $\mathbb{Q}$, then the function values of $\phi(t) = (t\alpha_1, t\alpha_2, \ldots, t\alpha_n)$ with each component 
reduced modulo 1 are C-uniformly distributed modulo 1. 

Kronecker’s theorem applies to $\alpha_1, \alpha_2, \ldots, \alpha_n$, where $\alpha_i = \frac{1}{2\pi} \log p_i$ for distinct prime 
numbers $p_1, p_2, \ldots, p_n$, because if $\sum_{i=1}^{n} (a_i/b_i) \log p_i = 0$, then $\prod_{i=1}^{n} p_i^{a_i/b_i} = 1$, which 
implies $a_i/b_i = 0$ for all $i$. 

With $\phi(t)$ so defined and with $J = (0, \tfrac12) \times (0, \tfrac12) \times \cdots \times (0, \tfrac12) \subset [0, 1]^n$, (1) yields 
$$\lim_{T \to \infty} \frac{1}{T} \int_0^T c_J(\phi(t))\, dt = \int_{[0, 1]^n} c_J(x)\, dx = 2^{-n}. \tag{2}$$

Now, $c_J(\phi(t)) = 1$ if and only if there are integers $m_i$ such that $2\pi m_i < t \log p_i < 2\pi m_i + \pi$ 
for $i = 1, 2, \ldots, n$. Put another way, $c_J(\phi(t)) = 1$ if and only if $\sin(t \log p_i) > 0$ for 
all $i$, i.e., $t \in S_n^*$. Therefore (2) yields 
$$\lim_{T \to \infty} \frac{1}{T}\, \lambda(S_n^* \cap [0, T]) = \lim_{T \to \infty} \frac{1}{T} \int_0^T c_J(\phi(t))\, dt = 2^{-n}. \tag{3}$$
A similar argument with $J$ replaced by $-J = (\tfrac12, 1) \times \cdots \times (\tfrac12, 1) \subset [0, 1]^n$ shows that 
$$\lim_{T \to \infty} \frac{1}{T}\, \lambda(S_n^* \cap [-T, 0]) = 2^{-n}, \tag{4}$$
and combining (3) and (4) yields the claim about the relative measure of $S_n^*$. 

The case of $C_n^*$ can be treated analogously. Here we replace $\phi(t)$ by $\omega(t) = \phi(t) + 
(\tfrac14, \ldots, \tfrac14)$, where as usual the additions are modulo 1. By the translation invariance of $\lambda$, 
the values of $\omega(t)$ are also C-uniformly distributed modulo 1. With $J$ as before, one finds 
$c_J(\omega(t)) = 1$ if and only if $\cos(t \log p_i) > 0$ for all $i$. Now, (1) yields 
$$\lim_{T \to \infty} \frac{1}{T}\, \lambda(C_n^* \cap [0, T]) = 2^{-n}. \tag{5}$$

Since the cosine is an even function, it is clear that $\lambda(C_n^* \cap [0, T]) = \lambda(C_n^* \cap [-T, 0])$, 
which completes the proof of the claim about the relative measure of the $C_n^*$. 

A reference for Kronecker’s theorem and the basic properties of uniform distribution 
modulo 1 is Edmund Hlawka, The Theory of Uniform Distribution, A B Academic Publish- 
ers, Berkhamsted, U. K., 1984. 
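
The limiting density $2^{-n}$ can be observed numerically for small $n$. The sketch below is an illustration only, not the solvers' method: the function name, the window $T$, and the sampling step are assumptions. It samples $[-T, T]$ on a uniform grid and counts the points at which every $\sin(t \log p_k)$ (or every $\cos(t \log p_k)$) is positive.

```python
import math

def relative_measure(primes, T=2000.0, step=0.01, use_cos=False):
    """Fraction of grid points t in [-T, T] with sin(t log p) > 0
    (cos(t log p) > 0 when use_cos is True) for every p in primes."""
    logs = [math.log(p) for p in primes]
    trig = math.cos if use_cos else math.sin
    count = int(round(2 * T / step))
    hits = 0
    for j in range(count):
        t = -T + (j + 0.5) * step          # midpoint of the j-th cell
        if all(trig(t * lp) > 0 for lp in logs):
            hits += 1
    return hits / count

if __name__ == "__main__":
    for n, primes in enumerate([[2], [2, 3], [2, 3, 5]], start=1):
        print(n, relative_measure(primes), relative_measure(primes, use_cos=True), 2.0 ** -n)
```

A finite window only approximates the limit, so the printed fractions should be near, not equal to, $2^{-n}$.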

Editorial comment. Robin Chapman’s solution followed the proof of Weyl’s Equidistribution 
Theorem in T. W. Korner, Fourier Analysis, Cambridge, 1988. 


Solved also by R. J. Chapman (U. K.), J. J. Dai, and J. Lauret (Argentina, n=2 only). 


Divisors of the Falling Factorial 


10431 [1995, 169]. Proposed by Yury J. Ionin, Central Michigan University, Mt. Pleasant, 
MI. For positive integers $n$ and $s$ with $n \ge s$, the falling factorial $(n)_s$ is defined as 
$n!/(n - s)!$. Let $d(n, s)$ denote the greatest common divisor of the falling factorials $(n)_s$ 
and $(n + s)_s$. Prove that $d(n, s) \mid (2s - 1)_{\lfloor 4s/3 \rfloor}$. 


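Solution by A. N. ’t Woord, University of Technology, Eindhoven, The Netherlands. For 
each prime $p$, let $v_p(x)$ be the largest integer $k$ such that $p^k$ divides $x$. It is sufficient to 
prove that $v_p(d(n, s)) \le v_p((2s - 1)_{\lfloor 4s/3 \rfloor})$ for every prime $p$. At most one integer in 
$\{n - s + 1, \ldots, n + s\}$ has a divisor $p^k \ge 2s$. If such an integer exists and exceeds $n$, let 
$m = (n)_s$; otherwise, let $m = (n + s)_s$. In either case, 
$$v_p(d(n, s)) \le v_p(m) \le \left\lceil \frac{s}{p} \right\rceil + \left\lceil \frac{s}{p^2} \right\rceil + \cdots + \left\lceil \frac{s}{p^j} \right\rceil,$$
where $p^j < 2s \le p^{j+1}$. Also, since $(2s - 1)_{\lfloor 4s/3 \rfloor} = (2s - 1)!/(s - 1 - \lfloor s/3 \rfloor)!$, 
$$v_p\bigl((2s - 1)_{\lfloor 4s/3 \rfloor}\bigr) = \sum_{k=1}^{j} \left( \left\lfloor \frac{2s - 1}{p^k} \right\rfloor - \left\lfloor \frac{s - 1 - \lfloor s/3 \rfloor}{p^k} \right\rfloor \right).$$
Thus it is enough to prove that, for $a < 2s$, 
$$L = \left\lfloor \frac{2s - 1}{a} \right\rfloor - \left\lceil \frac{s}{a} \right\rceil \ge \left\lfloor \frac{s - 1 - \lfloor s/3 \rfloor}{a} \right\rfloor = R.$$
Write $s = qa + r$ with $q = \lfloor s/a \rfloor$. Then $L = q + \lfloor (2r - 1)/a \rfloor - \lceil r/a \rceil$, and 
$R = q + \lfloor (r - 1 - \lfloor s/3 \rfloor)/a \rfloor$. For $0 \le 2r \le a$, we have $r - 1 - \lfloor s/3 \rfloor < 0$, since 
$3r \le a + r \le qa + r = s$ (here $q \ge 1$ because $a < 2s$). Consequently, $L = q - 1 \ge R$. For $2r > a$, we find 
$L = q \ge R$. Therefore, $L \ge R$ in all cases. 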
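
The divisibility can also be checked by brute force for small parameters. This is a minimal sketch for illustration only; the function names and the tested range are assumptions, and a finite check is of course no substitute for the proof above.

```python
from math import factorial, gcd

def falling(n, s):
    """Falling factorial (n)_s = n! / (n - s)!."""
    return factorial(n) // factorial(n - s)

def claim_holds(n, s):
    """True if d(n, s) = gcd((n)_s, (n + s)_s) divides (2s - 1)_{floor(4s/3)}."""
    d = gcd(falling(n, s), falling(n + s, s))
    bound = falling(2 * s - 1, (4 * s) // 3)
    return bound % d == 0

def check_range(max_n):
    """Test every pair 1 <= s <= n <= max_n."""
    return all(claim_holds(n, s)
               for n in range(1, max_n + 1)
               for s in range(1, n + 1))

if __name__ == "__main__":
    print(check_range(40))
```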


Solved also by J. H. Lindsey II, L. E. Mattics, and the proposer. 


The Domain of Permissible Pairings of First and Second Moments 


10505 [1996, 72]. Proposed by Fu-Chen Chang, National Sun Yat-Sen University, Kaohsiung, 
Taiwan. For $a, b \in \mathbb{R}$ with $a < b$ and $n \in \mathbb{N}$ with $n \ge 2$ let 
$$S_n(a, b) = \left\{ (m_1, m_2) : m_1 = \frac{1}{n} \sum_{i=1}^{n} x_i,\ m_2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 \right\}$$
with $\{x_1, \ldots, x_n\}$ ranging over all possible samples of $n$ numbers in the interval $[a, b]$. Find 
the area $A_n(a, b)$ of $S_n(a, b)$. 


Solution by Richard Holzsager, American University, Washington, DC. Shifting the interval 
does not change the area and stretching it by a factor $r$ multiplies the area by $r^3$ (since the 
width of $S_n$ gets multiplied by $r$ and the height by $r^2$). Therefore, it suffices to look at a 
convenient interval and then deduce the general result. Choose $a = 0$, $b = n$. For any 
$0 \le x \le n$, if $m_1 = x$, then $m_2 = x^2 + \mathrm{Var}(x_1, \ldots, x_n)$ varies from a minimum of $x^2$ to 
some maximum. The maximum occurs when the variance is greatest. This happens when 
all but at most one of the $x_i$’s are at endpoints; this is easy to see, because if two are interior, 
then raising the larger and lowering the smaller by the same amount increases the variance. 
So, if $i \le x \le i + 1$, then 0 occurs $n - i - 1$ times, $n$ occurs $i$ times, and the remaining term 
is $n(x - i)$. Then $m_2$ turns out to be $ni + n(x - i)^2$. Subtracting $x^2$ and integrating from $i$ 
to $i + 1$ gives $(n - 1)/3 + (n - 1)i - i^2$. Summing from $i = 0$ to $n - 1$ and simplifying, we 
get $(n^3 - n^2)/6$. The answer for $(a, b)$ is obtained by multiplying by the cube of $(b - a)/n$, 
giving $(1 - 1/n)(b - a)^3/6$. 
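
The arithmetic in this solution is easy to confirm mechanically. The sketch below is illustrative only, and its function names are assumptions. It sums the per-strip areas exactly with rational arithmetic, rescales to a general interval, and also evaluates the upper boundary $ni + n(x - i)^2$ so that the area can be cross-checked by numerical integration.

```python
from fractions import Fraction

def strip_area(n, i):
    """Area between m2 = x^2 and the upper boundary for i <= x <= i + 1 on [0, n]."""
    return Fraction(n - 1, 3) + (n - 1) * i - i * i

def area_on_0_n(n):
    """Total area of S_n(0, n): sum of the n strip areas."""
    return sum(strip_area(n, i) for i in range(n))

def area(n, a, b):
    """Area of S_n(a, b), obtained by rescaling with the cube of (b - a)/n."""
    return area_on_0_n(n) * (Fraction(b - a) / n) ** 3

def upper_m2(n, x):
    """Largest second moment on [0, n] when the mean is x."""
    i = min(int(x), n - 1)
    return n * i + n * (x - i) ** 2

def numeric_area_on_0_n(n, cells_per_unit=1000):
    """Midpoint-rule integral of upper_m2(x) - x^2 over [0, n]."""
    h = 1.0 / cells_per_unit
    return sum((upper_m2(n, (j + 0.5) * h) - ((j + 0.5) * h) ** 2) * h
               for j in range(n * cells_per_unit))

if __name__ == "__main__":
    for n in range(2, 6):
        print(n, area_on_0_n(n), area(n, 0, 1), numeric_area_on_0_n(n))
```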

Editorial comment. Several solvers noted the geometrical formulation of this problem and 
its solution: Picture the cube $[a, b]^n$ in $\mathbb{R}^n$. (It is easiest to do this in three dimensions!) The 
value of $m_1$ is constant along hyperplanes perpendicular to the main diagonal of this cube 
(the one that passes through the origin). The value of m2 is constant on spherical surfaces, 
because it is related to distance from the origin. The minimum value of m2 on a hyperplane 
section within the cube is therefore on the diagonal, because this is the closest point to the 
origin; the maximum value is on an edge of the cube, since such points of a cross-section 
are farthest from the origin. 

Now picture a hyperplane sweeping through the cube from nearest corner (to the origin) 
to farthest corner (from the origin), remaining always perpendicular to the main diagonal of 
the cube. At first, this hyperplane intersects the n edges emanating from the lower vertex; 
then at a certain point it contains a set of n vertices; then for a while it intersects certain 
edges till once again it contains a set of vertices; this pattern continues until the upper vertex 
is reached. Thus there are n episodes, each involving passing from intersection with a set of 
vertices through intersection with a set of edges to intersection again with a set of vertices. 
Within each episode the behavior of m2 is parabolic. On the other hand, the behavior of m 
is parabolic through the entire operation. 

J. H. B. Kemperman referred to related results included in his paper: Moment problems 
for sampling without replacement, Indagationes Mathematicae 76 (1973) 181-188. 

Solved also by R. A. Agnew, M. Benedicty, R. J. Chapman (U. K.), D. L. Farnsworth, M. Hoffman, J. H. B. Kemperman, 
M. S. Klamkin (Canada), J. H. Lindsey II, R. Richberg (Germany), K. Schilling, R. Stong, A. Tissier (France), T. V. Trif 
(Romania), A. N. ’t Woord (The Netherlands), GCHQ Problems Group (U. K., two solutions), WMC Problems Group, 
USA Problems Group, and the proposer. 


An Application of Crofton’s Theorem 


10512 [1996, 267]. Proposed by V. A. Zalgaller, Steklov Mathematical Institute, Russian 
Academy of Sciences, St. Petersburg, Russia. Let $Q_1$ and $Q_2$ be compact subsets of the 
Cartesian half-plane $y \ge 0$. Assume that both $Q_1$ and $Q_2$ contain points with $y > 0$. Let 
$\Phi_1 = \mathrm{Conv}(Q_1 \cup Q_2)$ and $l_1 = \mathrm{Len}(\partial\Phi_1)$. Let $I(Q)$ denote the set obtained by reflecting 
the set $Q$ in the x-axis. Let $\Phi_2 = \mathrm{Conv}(Q_1 \cup I(Q_2))$ and $l_2 = \mathrm{Len}(\partial\Phi_2)$. Prove $l_2 > l_1$. 


Solution by Sung Soo Kim, Hanyang University, Ansan, Kyunggi, Korea. Parameterize the 
lines in the plane by the pairs $(r, \theta)$, where $-\pi/2 < \theta \le \pi/2$ is the angle between the line 
and the horizontal, and $r$ is the distance of the line from the origin. Crofton’s Theorem (see 
for example L. A. Santaló, Integral Geometry and Geometric Probability, Addison-Wesley, 
1976) says that for any compact convex set $Q$, the perimeter of $Q$ is twice the area of the 
set of $(r, \theta)$ corresponding to lines that intersect $Q$. Let $A$ be the set of lines that intersect 
$\Phi_1$ and $B$ be the set of lines that intersect $\Phi_2$. We have to prove that $B$ has greater area than 
$A$, or equivalently that $B - A$ has greater area than $A - B$. 

Let $l \in A - B$. Then either (a) $l$ is horizontal, intersects $Q_2$, and lies above $Q_1$, or 
(b) $l$ is a slanted line with $Q_1$ and $I(Q_2)$ on one side, but $Q_2$ not entirely on that side. In 
either case, $I(l)$ is in $B - A$. Therefore $I(A - B) \subset B - A$. On the other hand, if $l$ is 
horizontal, and just below the x-axis, then $l$ is in $B - A$ and $I(l)$ is in $B$ (and so not in 
$A - B$). In fact, the same is true of all lines in a sufficiently small neighborhood $U$ of $l$, so 
$I(A - B) \subset B - A - U$. Since $I$ preserves area, we are done. 
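
The inequality $l_2 > l_1$ can be tried out on finite point sets, which are compact. The sketch below is not the solver's argument; it simply computes both hull perimeters. The convex hull routine (Andrew's monotone chain) and all names are choices made for this illustration.

```python
import math
import random

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in anticlockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def perimeter(points):
    """Length of the boundary of the convex hull (a segment counts twice)."""
    hull = convex_hull(points)
    return sum(math.dist(hull[i], hull[(i + 1) % len(hull)])
               for i in range(len(hull)))

def reflect(points):
    """Reflection I in the x-axis."""
    return [(x, -y) for (x, y) in points]

def hull_lengths(Q1, Q2):
    """Return (l1, l2) = (Len of Conv(Q1 u Q2), Len of Conv(Q1 u I(Q2)))."""
    return perimeter(Q1 + Q2), perimeter(Q1 + reflect(Q2))

if __name__ == "__main__":
    rng = random.Random(0)
    Q1 = [(rng.uniform(-1, 1), rng.uniform(0.05, 1)) for _ in range(4)]
    Q2 = [(rng.uniform(-1, 1), rng.uniform(0.05, 1)) for _ in range(4)]
    print(hull_lengths(Q1, Q2))
```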


Solved also by G. Pete (Hungary), R. Stong, the GCHQ Problem Group (U. K.), and the proposer. 


A Regular Hexagon Emerging from a Triangle 


10514 [1996, 267]. Proposed by Jiro Fukuta, Shinsei-cho, Motosu-gun, Gifu-ken, Japan. 
In $\triangle ABC$, let $P_1$ and $P_2$, $P_3$ and $P_4$, $P_5$ and $P_6$ be the points on the sides $BC$, $CA$, $AB$ 
respectively, such that 
$$\frac{|BP_1|}{|P_1C|} = \frac{|CP_2|}{|P_2B|} = \frac{|CP_3|}{|P_3A|} = \frac{|AP_4|}{|P_4C|} = \frac{|AP_5|}{|P_5B|} = \frac{|BP_6|}{|P_6A|} = r$$
with $0 < r < 1$. Let $A'$, $B'$, $C'$ be the points of intersection of $P_1P_4$ and $P_2P_5$, $P_3P_6$ and 
$P_4P_1$, $P_5P_2$ and $P_6P_3$, respectively. Let $Q_iP_iP_{i+1}$, $i = 1, \ldots, 6$ be the equilateral triangles 
built outwards on the sides of the hexagon $P_1P_2 \ldots P_6$. Let $R_iQ_{i-1}Q_{i+1}$, $i = 1, \ldots, 6$ be 
the equilateral triangles built outwards on the diagonals of the hexagon $Q_1Q_2 \ldots Q_6$. 

(a) Show that the points $Q_1$, $A'$, and $Q_4$ lie on $R_1R_4$. 

(b) Show that the diagonals $R_1R_4$, $R_2R_5$, and $R_3R_6$ are concurrent and equal in length, and 
that the angle of intersection of any two of these lines is 60°. 

(c) Let $G_i$ be the centroid of the triangle $R_{i-1}R_iR_{i+1}$, $i = 1, \ldots, 6$. Show that $G_1 \ldots G_6$ 
is a regular hexagon and that its center coincides with the centroid of the triangle $ABC$. 


Solution by Robin Chapman, University of Exeter, Exeter, U. K. Consider the triangle as 
embedded in the complex plane, and assume that $A$, $B$, and $C$ appear in that order as the 
perimeter is traversed in an anticlockwise direction. Let the complex numbers $(r + 1)\alpha$, 
$(r + 1)\beta$, and $(r + 1)\gamma$ correspond to $A$, $B$, and $C$, respectively. Then $P_1 = \beta + r\gamma$, 
$P_2 = r\beta + \gamma$, $P_3 = \gamma + r\alpha$, $P_4 = r\gamma + \alpha$, $P_5 = \alpha + r\beta$, $P_6 = r\alpha + \beta$. The quadrilateral 
$AP_4A'P_5$ is a parallelogram and so $A' = P_4 + P_5 - A = (1 - r)\alpha + r\beta + r\gamma$. Note 
that if $XYZ$ is an equilateral triangle labelled anticlockwise, then $Z = \bar\zeta X + \zeta Y$ where 
$\zeta = \exp(\pi i/3) = (1 + \sqrt{-3})/2$. Hence $Q_i = \zeta P_i + \bar\zeta P_{i+1}$ and $R_i = \zeta Q_{i-1} + \bar\zeta Q_{i+1}$. 
In particular, 
$$Q_1 = (\zeta + r\bar\zeta)\beta + (r\zeta + \bar\zeta)\gamma, \qquad Q_4 = \alpha + r\bar\zeta\beta + r\zeta\gamma,$$
$$R_1 = -r\alpha + (\zeta + r)\beta + (\bar\zeta + r)\gamma, \quad\text{and}\quad R_4 = (2 - r)\alpha + (\bar\zeta^2 + r)\beta + (\zeta^2 + r)\gamma.$$

It follows that $R_4 - R_1 = 2(\alpha + \bar\zeta^2\beta + \zeta^2\gamma)$. Similarly, $R_6 - R_3 = 2(\beta + \bar\zeta^2\gamma + \zeta^2\alpha) = 
\zeta^2(R_4 - R_1)$ and $R_2 - R_5 = 2(\gamma + \bar\zeta^2\alpha + \zeta^2\beta) = \zeta^2(R_6 - R_3)$. Hence the line segments 
$R_1R_4$, $R_3R_6$, and $R_5R_2$ all have the same length and make angles $\pi/3$ with each other. Also 
$A' = rQ_1 + (1 - r)Q_4$, $R_1 = (1 + r)Q_1 - rQ_4$, and $R_4 = (r - 1)Q_1 + (2 - r)Q_4$, so 
$Q_1$, $Q_4$, $A'$, $R_1$, and $R_4$ are all collinear. 

Now $3G_i = R_{i-1} + R_i + R_{i+1}$; in particular $3G_1 = (r - 1)\alpha + (r + 1 + 2\zeta)\beta + (r + 1 + 2\bar\zeta)\gamma$ 
and $3G_2 = (2\bar\zeta - 1 + r)\alpha + (2\zeta - 1 + r)\beta + (3 + r)\gamma$. If $H$ is the centroid of $\triangle ABC$ 
then $3H = (r + 1)(\alpha + \beta + \gamma)$. It follows that $G_1 - H = (2/3)(-\alpha + \zeta\beta + \bar\zeta\gamma)$ and 
$G_2 - H = (2/3)(\bar\zeta^2\alpha + \zeta^2\beta + \gamma) = \zeta(G_1 - H)$. Extending this computation gives 
$G_i - H = \zeta^{i-1}(G_1 - H)$ for all $i$, and so the $G_i$ are the vertices of a regular hexagon with 
centre $H$. 

It remains to show only that the lines $R_1R_4$, $R_2R_5$, and $R_3R_6$ are concurrent. The points 
$R_5$, $R_1$, and $R_3$ are the third vertices of the equilateral triangles constructed outwards on the 
sides of $\triangle Q_2Q_4Q_6$. The lines $Q_2R_5$, $Q_4R_1$ and $Q_6R_3$ are then concurrent in the Fermat 
point of $\triangle Q_2Q_4Q_6$ (see §4.2 of H. S. M. Coxeter & S. L. Greitzer, Geometry Revisited, 
Mathematical Association of America, 1967). But since $Q_2$ lies on the line $R_2R_5$, the line 
$R_2R_5$ is the line $Q_2R_5$ and so on. Hence the lines $R_2R_5$, $R_4R_1$, and $R_6R_3$ are concurrent. 
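
The identities in this solution lend themselves to a numerical spot check with complex arithmetic. The following sketch uses the formulas $Q_i = \zeta P_i + \bar\zeta P_{i+1}$ and $R_i = \zeta Q_{i-1} + \bar\zeta Q_{i+1}$ from the solution; the function names, the sample triangle, and the value of $r$ are assumptions chosen for illustration.

```python
import cmath

ZETA = cmath.exp(1j * cmath.pi / 3)    # zeta = exp(pi i / 3)
ZBAR = ZETA.conjugate()

def configuration(A, B, C, r):
    """Build P, Q, R, G (0-based lists), A' and the centroid H of ABC."""
    al, be, ga = A / (r + 1), B / (r + 1), C / (r + 1)
    P = [be + r * ga, r * be + ga, ga + r * al, r * ga + al, al + r * be, r * al + be]
    Q = [ZETA * P[i] + ZBAR * P[(i + 1) % 6] for i in range(6)]
    R = [ZETA * Q[(i - 1) % 6] + ZBAR * Q[(i + 1) % 6] for i in range(6)]
    G = [(R[(i - 1) % 6] + R[i] + R[(i + 1) % 6]) / 3 for i in range(6)]
    A_prime = P[3] + P[4] - A          # parallelogram A P4 A' P5
    H = (A + B + C) / 3
    return P, Q, R, G, A_prime, H

def cross(u, v):
    """2D cross product of the vectors represented by complex u and v."""
    return (u.conjugate() * v).imag

def line_intersection(a, b, c, d):
    """Intersection of line ab with line cd (lines assumed not parallel)."""
    t = cross(c - a, d - c) / cross(b - a, d - c)
    return a + t * (b - a)

if __name__ == "__main__":
    P, Q, R, G, A_prime, H = configuration(0j, 4 + 0j, 1 + 3j, 0.3)
    print([abs(R[i + 3] - R[i]) for i in range(3)])   # three equal lengths
    print([abs(g - H) for g in G])                    # six equal radii
```

Because every check is an algebraic identity in $\alpha$, $\beta$, $\gamma$, the computed residuals should vanish up to rounding for any non-degenerate input.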


Solved also by M. Benedicty, Z. Cerin (Croatia), J. Duncan, N. Komanda, O. P. Lossers (The Netherlands), G. Pete 
(Hungary), R. Simon (Chile), T. Trif (Romania), R. Young, the Con Amore Problem Group (Denmark), and the proposer. 


Splitting a Random Subset of the Ball 


10518 [1996, 347]. Proposed by Yunan Diao, Kennesaw State College, Marietta, GA. Let 
$X$ be a finite set of points in a metric space and let $X_1$ and $X_2$ be a partition of $X$ into two 
disjoint nonempty subsets. Let $d(X_1, X_2) = \min\{d(x_1, x_2) : x_1 \in X_1, x_2 \in X_2\}$ be called 
the distance between the subsets, and let the largest value of the distance between two such 
subsets be called the splitting number of $X$. 

If $X$ consists of $n$ random points, independently selected from the uniform distribution 
on a ball of radius 1 in 3-dimensional Euclidean space, show that the splitting number of 
$X$ is almost surely small. More precisely, for $a < 1$, show that there is a constant $\alpha > 0$ 
depending only on $a$ such that the splitting number of $X$ is less than $a$ with probability at 
least $1 - e^{-\alpha n}$. 


Solution by Jonathan Pillow, University of Arizona, Tucson, AZ. Let $S_n$ denote the splitting 
number of the $n$ points in the set $X$. Given $a < 1$, create a 3-dimensional cubic grid, 
spaced $a/\sqrt{6}$ on a side, and intersect it with a ball of radius 1 in $\mathbb{R}^3$, partitioning the ball into 
$m$ open regions, $A_1, A_2, \ldots, A_m$, and some boundary points (the grid planes and the surface 
of the sphere). Note that, if the ball is placed tangent to a grid plane in each direction, then 
$m \le \lceil 2\sqrt{6}/a \rceil^3$. Also note that, for any two points $x$ and $y$ in regions $A_i$ and $A_j$ that are 
adjacent (along a face), we have $d(x, y) < a$. 

The probability that $X$ contains a boundary point is 0, so assume that $X \subset \bigcup_{i=1}^{m} A_i$. If 
$A_i \cap X$ is nonempty for all regions $A_i$, $1 \le i \le m$, then we must have $S_n < a$, because for 
every nontrivial partition of $X$ into sets $X_1$ and $X_2$, there must exist $x \in X_1$ and $y \in X_2$ 
such that $x$ and $y$ are in adjacent regions, implying $d(X_1, X_2) < a$. Thus, $S_n \ge a$ implies 
that at least one region $A_i$ contains no point of $X$, so 
$$P(S_n \ge a) \le P(\exists i : A_i \cap X = \emptyset) \le \sum_{i=1}^{m} P(A_i \cap X = \emptyset).$$
Now, the probability that $A_i$ contains no points of $X$ is $(1 - 3|A_i|/4\pi)^n$, where $|A_i|$ is the 
volume of $A_i$. Taking $A_1$ to be the smallest $A_i$, we get $P(S_n \ge a) \le m(1 - 3|A_1|/4\pi)^n$. 
We now argue that $P(S_n \ge a) \le e^{-\alpha n}$, where $\alpha$ is derived in the following manner. 
Observe that, if $n \ge n_0 = (4\pi/|A_1|) \log m$, then 
$$\log P(S_n \ge a) \le \log m + n \log\left(1 - \frac{3|A_1|}{4\pi}\right) \le \log m - \frac{3n|A_1|}{4\pi} = -n\left(\frac{3|A_1|}{4\pi} - \frac{\log m}{n}\right) \le -n\left(\frac{2|A_1|}{4\pi}\right) = -\alpha_1 n,$$
so $P(S_n \ge a) \le e^{-\alpha_1 n}$. On the other hand, if $n < n_0$, note that $P(S_n \ge a) \le e^{n \log(P_0)/n_0} = 
e^{-\alpha_2 n}$, where $P_0 = \max_{n < n_0} P(S_n \ge a)$. It follows that $P(S_n < a) \ge 1 - e^{-\alpha n}$ for all $n$, 
where $\alpha = \min(\alpha_1, \alpha_2)$. 
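
For experimentation, the splitting number of a finite point set can be computed directly. The sketch below relies on a standard fact that is not used in the solution above and is swapped in here for convenience: the splitting number equals the longest edge of a minimum spanning tree, since every partition is crossed by some tree edge and removing the longest tree edge yields a partition at exactly that distance. All function names are assumptions.

```python
import math
import random

def splitting_number(points):
    """Longest edge of a minimum spanning tree (Prim's algorithm, O(n^2))."""
    n = len(points)
    best = [math.inf] * n      # best[k]: distance from point k to the current tree
    in_tree = [False] * n
    best[0] = 0.0
    longest = 0.0
    for _ in range(n):
        i = min((k for k in range(n) if not in_tree[k]), key=best.__getitem__)
        in_tree[i] = True
        longest = max(longest, best[i])
        for k in range(n):
            if not in_tree[k]:
                best[k] = min(best[k], math.dist(points[i], points[k]))
    return longest

def brute_force_splitting(points):
    """Maximum of d(X1, X2) over all partitions into two nonempty subsets."""
    n = len(points)
    result = 0.0
    for mask in range(1, 2 ** (n - 1)):      # the last point always lies in X2
        X1 = [points[i] for i in range(n) if mask >> i & 1]
        X2 = [points[i] for i in range(n) if not mask >> i & 1]
        result = max(result, min(math.dist(p, q) for p in X1 for q in X2))
    return result

def random_ball_points(n, rng):
    """n independent uniform points in the unit ball, by rejection sampling."""
    pts = []
    while len(pts) < n:
        p = tuple(rng.uniform(-1, 1) for _ in range(3))
        if sum(c * c for c in p) <= 1.0:
            pts.append(p)
    return pts

if __name__ == "__main__":
    rng = random.Random(1)
    for n in (10, 100, 1000):
        print(n, splitting_number(random_ball_points(n, rng)))
```

Running the simulation for increasing $n$ shows the splitting number shrinking, in line with the exponential bound.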

Solver’s note: Better estimates for $\alpha$ may be obtained if greater care is used in partitioning 
the ball into regions. If we consider points randomly distributed in [0, 1], then the analysis 
is much simpler and we can easily show that $\alpha \ge a^3/(2\sqrt{6})$. We may also let $a$ vary with $n$. 
Jessica Sidman of Scripps College and I considered such questions at the Michigan Tech 
REU program, and we proved that, for points chosen from [0, 1], 


l 
lim p (= Ogn s, = (Ret) a1 
n— Oo 


Solved also by J. H. Lindsey II, R. Stong, and the proposer. 


REVIEWS 


Edited by Underwood Dudley 
Mathematics Department, DePauw University, Greencastle, IN 46135 


An Introduction to Difference Equations, by Saber Elaydi. Springer-Verlag, New York, 
1996, 336 pp, $45. 


Reviewed by Ronald E. Mickens 


The Newtonian revolution in physics led to the use of differential equations as the 
fundamental basis for the mathematical modeling of dynamical systems. Paradoxi- 
cally, the great success of this paradigm also led to the widespread use of 
difference equations as discrete models of differential equations, primarily for the 
purposes of numerical integration of these equations. More recently, the easy 
access to digital computers provided a strong incentive for their users to have a 
knowledge of at least the elementary properties of difference equations. Modern 
books on the subject began to appear in the 1950’s. One of the first and most 
popular of them was written by Goldberg [3]. It gave a general introduction to 
linear difference equations and applied them to the formulation and solution of 
problems in economics, psychology, and sociology. During the decade of the 1960’s 
several books appeared for the mathematically more sophisticated reader, for 
example those by Levy and Lessman [10], Brand [2], Hildebrand [4], and Miller 
[14]. 

The past ten years have seen the arrival of books that go beyond just giving an 
introduction to the various elementary properties of difference equations. 
Collectively [1, 5, 6, 7, 8, 9, 12, 13, 15], they consider the generalization to difference 
equations of many of the techniques usually associated with linear and nonlinear 
differential equations. They raise, and provide (partial) answers to, 
questions such as 


• How can the asymptotic behavior of the solutions be determined? 

• For difference equations with a nonlinear term multiplied by a “small” 
parameter, can perturbation methods be devised to provide uniform 
approximations to the solution? 

• For difference equations, what corresponds to the Hopf bifurcation theorem? 

• Can “exact” difference equation models of differential equations be 
constructed? 

• How can accurate numerical solutions be determined computationally for all 
the solutions of a given linear difference equation? 


The recent blossoming of research in difference equations and their application 
to problems in nonlinear discrete dynamics led Gerry Ladas and Saber Elaydi to 
found the Journal of Difference Equations and Applications (published by Gordon 
and Breach), whose first issue was published in January 1995. A major reason for 
this new journal was the need for a place where researchers in difference 
equations could publish excellent papers on their results and have them read by 
colleagues in the field. Another factor was the lack of any mathematics journal that 
emphasized work in difference equations. In the past, researchers in this area 
usually published papers in journals devoted mainly to differential equations or in 
journals dealing with specific discipline-based applications. 

Much of the current interest in difference equations within the general scientific 
community had its genesis in a review paper written by Robert May in 1976 [11]. 
He did an analytical and computational investigation of the properties of the 
solutions to 


$$x_{k+1} = \lambda x_k (1 - x_k), \qquad \lambda \ge 1. \tag{1}$$


This equation can be considered a discrete model for the growth of a single 
population using a finite-difference approximation to the so-called logistic 
differential equation 
$$\frac{dy}{dt} = y(1 - y). \tag{2}$$
That is, if $y_k$ is an approximation to $y(t_k)$, where $t_k = hk$ with $\Delta t = h$, then using 
a forward Euler approximation for the derivative gives the scheme 
$$\frac{y_{k+1} - y_k}{h} = y_k(1 - y_k).$$
The substitution 
$$x_k = \frac{h}{1 + h}\, y_k, \qquad \lambda = 1 + h,$$
leads to (1). Now for positive initial data, $y(0) = y_0 > 0$, all the solutions of (2) go 
monotonically to the value $y = 1$. However, it is rather easy to demonstrate, as 
May [11] does, that (1) exhibits a variety of solution behaviors (periodic, chaotic, 
etc.) depending on the value selected for $\lambda$. 
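
The relation between the Euler scheme and the logistic map can be tried out directly. The sketch below is an illustration only; the function names, step sizes, and initial values are assumptions. It iterates both recursions, checks the substitution $x_k = h y_k/(1 + h)$ with $\lambda = 1 + h$, and shows one parameter value at which the map settles into a period-2 orbit.

```python
def euler_logistic(y0, h, steps):
    """Forward Euler for dy/dt = y(1 - y): y_{k+1} = y_k + h y_k (1 - y_k)."""
    ys = [y0]
    for _ in range(steps):
        y = ys[-1]
        ys.append(y + h * y * (1.0 - y))
    return ys

def logistic_map(x0, lam, steps):
    """Iterates of x_{k+1} = lam x_k (1 - x_k)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(lam * x * (1.0 - x))
    return xs

if __name__ == "__main__":
    h, y0 = 0.5, 0.2
    ys = euler_logistic(y0, h, 10)
    xs = logistic_map(h / (1 + h) * y0, 1 + h, 10)
    # The two columns should agree up to rounding.
    for y, x in zip(ys, xs):
        print(h / (1 + h) * y, x)
    # A small step gives monotone approach to y = 1.
    print(euler_logistic(0.2, 0.1, 300)[-1])
    # A larger parameter value gives a period-2 orbit.
    print(logistic_map(0.3, 3.2, 1000)[-4:])
```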

While (1) has a simple mathematical form, it turns out that many phenomena in 
the sciences can be modeled by simple difference equations whose solutions 
describe complex dynamical behavior [8, 9, 15]. A major advantage of discrete 
models for such systems is that, while the equations of motion cannot, in general, 
be solved analytically in terms of elementary functions, the solutions can be 
determined easily with the aid of digital computers. The results can then be 
displayed in a variety of visual formats for study and analysis. These facts have led 
many colleges and universities to create courses on discrete dynamical systems 
and/or difference equations. One consequence of these new courses was the 
publication of several books that could be used as introductions to discrete 
dynamical systems and difference equations or to more advanced topics in the 
theory of difference equations. The book under review combines both of these 
features. 

The author, Saber Elaydi, is Professor of Mathematics at Trinity University in 
San Antonio, Texas. He has made important contributions in difference equations, 
especially in the areas of the discrete Levinson’s theorem and the theory of 
Volterra difference equations. His book consists of eight chapters and is based on 
a course that he teaches at Trinity. The students in the course are upper-level 
undergraduates and come largely from mathematics and the physical and engineer- 
ing sciences. The background required of the students is rather minimal, calculus 
and linear algebra, but certain topics from advanced calculus are needed for 
material near the end of the book. 

The first three chapters introduce the reader to the fundamental concepts 
needed to understand both linear and nonlinear difference equations. In particu- 
lar, the author does an excellent job in his presentation of the criteria for the 
asymptotic stability of fixed points (Chapter 1) and in its generalization in Chapter 
4, where he discusses determination of stability by both linear approximation and 
Liapunov’s second method. 

Chapter 5 gives a thorough discussion of the Z-transform method, which serves 
exactly the same function for difference equations as the Laplace transform does 
for differential equations. In addition to showing how this technique can be used 
to solve linear difference equations, the author applies it to the scalar case of 
Volterra difference equations of convolution type. 

Chapter 6 is on control theory. The chapter considers only time-invariant 
(autonomous) discrete systems and covers the basic concepts needed for an 
introduction to this topic: controllability, observability, and stabilizability by feed- 
back. Chapter 7 gives an excellent introduction to various techniques that can be 
applied to determine the asymptotic behavior of solutions to both linear and 
nonlinear difference equations. In addition to discussing the well-known theorems 
of Poincaré and Perron, the author includes some of his own recently discovered 
results in this area. The final Chapter 8 is on oscillation theory. Let a nontrivial 
solution to a difference equation be denoted by x,. The solution is said to be 
oscillatory (around zero) if for every positive integer $N$ there exists $k \ge N$ such that 
$x_k x_{k+1} \le 0$. If this is not the case, the solution is said to be non-oscillatory. The 
author presents an introduction to this topic and includes references where a more 
advanced treatment can be found. 

There are several features that I especially like about the book. First, it contains 
a very extensive set of exercises at the end of each section. They are used not only 
as applications of the previously given theory but also, in many cases, help the 
reader extend the theoretical discussion given earlier in the section. Second, many 
applications are given for a variety of disciplines. For example, the materials in 
Sections 3.5.1—3.5.3 on Markov chains are excellent. Third, the book also includes 
several programs written especially for the TI-85 calculator. This feature will 
certainly help the reader discover various interesting features of difference equa- 
tions through actual experimentation with the calculator. 

The book does contain some typos and misprints. However, the alert reader can 
locate them rather easily. (My copy of the book came with an errata sheet.) I found 
the book to be well-written and very suitable for a good solid introduction to the 
fundamentals of difference equations, some of their applications, and several 
related advanced topics. 


Calculus Lite, by Frank Morgan. A K Peters, Wellesley, 1995, 281 pp., $29.95. 


Reviewed by Wayne Roberts 


The calculus reform movement, when it started, had “lean and lively” as its slogan. 
This book makes no claim to having been influenced by the themes of reform, and 
there is considerable evidence to justify the silence. Nevertheless, with respect to 
the elusive goal of being lean this book, only 281 pages from cover to cover with 
each page only 6” by 9”, succeeds where other books inspired by the reform 
movement have conspicuously failed. I have observed at several conferences on 
calculus reform that if mathematicians had but looked at themselves, they might 
have known in the beginning that it is easier to be lively than to be lean. Difficult 
though it may be to write a leaner calculus, however, it does seem worthwhile to 
write a little more about the virtue we hoped to achieve by writing less. 

Stated succinctly, the goal was to focus attention on the key ideas of calculus. It 
was thought that calculus books exceeding one thousand pages in length were 
obscuring the structure of the calculus edifice by dwelling too much on the 
surrounding appurtenances. Where Gauss was concerned lest a fine building be 
obscured by scaffolding, the would-be reformers were concerned that it might be 
obscured by bric-a-brac. 

Let us now ask what has actually happened. Some books have gotten smaller. 
Those that have made the most visible progress have done it, however, by the 
simple expedient of writing a one-year single-variable text, leaving the multivari- 
able material to another book. This somehow misses the spirit of the thing. It is 
true, of course, that many of the newer books do omit topics commonly found in 
their weightier predecessors, but since shorter was not for the sake of shortness 
but for the sake of emphasis on key ideas, we are compelled to ask not if the newer 
books are shorter, but whether they really do focus student attention on the main 
ideas. I am not sure the answer is as clear as we would wish. 

The readers of these newer books are being directed to the right ideas. The 
derivative of a function at a point is the slope of the line tangent to the graph of 
the function at that point. The integral of a function over an interval is the area 
under the graph of the function. The miracle is that these two ideas are related in 
a way that, from the right point of view, seems natural. I doubt, however, that even 
the best of the readers are yet seeing these truths jumping out from their texts. 
Instead of being obscured by more and more material, they are obscured by more 
and more lengthy explanations. In this respect, Morgan seems closer to the original 
spirit by proclaiming his intention of “getting right to the point, and stopping 
there.” 

Morgan also illustrates, in his handling of the role of continuity in maxima and 
minima problems, how to avoid a tendency of authors that has been likened to 
giving a youngster a first hammer, and then instead of illustrating its usefulness in 
pounding nails, going immediately to a warning that it is not to be used for driving 
screws. After discussing a number of situations in which intuition is a reliable 
guide, he quotes a theorem: A continuous function y = f(x) on a closed interval 
[a, b] attains a maximum and a minimum. He then acknowledges that “it sounds 
right and obvious that a function will have a maximum and a minimum somewhere, 
and the details about ‘continuous’ and ‘on a closed interval [a, b]’ do not sound 
important.” There follows a short discussion (half a page) that draws on some 
pictures to illustrate some possible troubles and concludes with, “The proof of this 
delicate theorem depends on a deep understanding of real numbers, continuous 
functions, and the difference between closed and open intervals. You can learn 
about it in a course in real analysis.” 

It is possible to show how something works when everything is going right, to be 
honest about the fact that sometimes things go wrong, to indicate that difficulties 
can be adequately dealt with, and still be concise. It is regrettable that authors 
reaching for the laudable goals of making things seem intuitive and natural so 
often feel that the goal precludes stating an important definition or theorem 
crisply, or saying anything about the necessity of ultimately undertaking a more 
careful analysis. It need not be so. 

There are numerous ways in which Morgan achieves his brevity. He proves that 
the derivative of $t^n$ is $nt^{n-1}$ for $n = 1, 2, 3$ and then for any integer $n$, all on one 
page, and shortly (sixteen pages) has the student using the sum, product, quotient, 
and chain rules for differentiation of expressions built from polynomials. He 
reviews all one needs to know about sines and cosines, and derives in a highly 
intuitive way their derivatives, all in eight pages. He restricts discussion of finding 
antiderivatives to the use of substitution (four pages), tables (four pages), parts 
(one and one-half pages) and partial fractions (three pages), about which he says 
that “Focusing ...on distinct, linear factors provides the theory and ninety percent 
of the applications without the time-consuming algebra.” 

At times he abandons any appeal to intuition, resorting to what might best be 
described as blurting out some rules, as he does, for example, in his treatment of 
the laws of logarithms. His unmotivated definition of e as “the bank balance after 
one year of one dollar invested at an interest rate of 100% compounded continu- 
ously” misses a wonderful opportunity to let students discover an important 
number for themselves, and the quick move to $(a^x)' = (\ln a)a^x$ is a deplorable 
example of derivation by decree. 

Letting students discover e for themselves is not an option for Morgan, 
however, determined as he is to write a text that in no way acknowledges the 
existence of calculators. Indeed, it is not until page 193 that, at the close of his 
four-page chapter on numerical methods, he acknowledges that computers exist. 
There certainly are materials written that contrive problems with no apparent 
purpose except to drive the student to technology to help with complex computa- 
tions. If this were the only use to be made of such help, one might agree with 
Morgan’s apparent conclusion that it’s better to focus on ideas to the exclusion of 
any use of calculators or computers. There are better ways to use the tools now 
available, however. Students can discover things they once had to be told (they can, 
with a calculator and the definition of the derivative, discover the number b for 
which $b^x$ will be its own derivative). They can understand important ideas once 
passed over as abstractions that weren’t likely to be on the test anyway (they can, 
with the use of the integrate key on their calculator, make a table of values for 
$f(x) = \int_0^x \sqrt{t^2 + 4}\,dt$ and so come to understand that an integral with a variable 
limit really does define a function). They can work on intrinsically interesting 
problems that don’t have to be avoided because of computational complexity. To 
completely ignore this potential in a book with a 1995 copyright, as Morgan does, 
seems curious at best. 
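
The two calculator experiments mentioned above can be sketched in a few lines of Python standing in for the calculator. This is only an illustration under stated assumptions: the helper names (`slope_at_zero`, `find_base`, `integral`), the step sizes, the integrand sqrt(t^2 + 4), and the lower limit 0 are all choices made here, not taken from Morgan's book. The first part uses the fact that the derivative of b^x equals b^x times the slope of b^x at x = 0, so the sought base is the one whose slope at 0 is 1. The second part tabulates an integral with a variable upper limit.

```python
import math

def slope_at_zero(b, h=1e-6):
    """Central-difference estimate of the derivative of b**x at x = 0."""
    return (b**h - b**(-h)) / (2 * h)

def find_base(lo=2.0, hi=3.0, tol=1e-9):
    """Bisect for the base b whose exponential has slope 1 at x = 0."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if slope_at_zero(mid) < 1.0:
            lo = mid   # slope too small: the base must be larger
        else:
            hi = mid   # slope too large: the base must be smaller
    return (lo + hi) / 2

def integral(g, a, x, n=1000):
    """Composite Simpson's rule for the integral of g from a to x (n even)."""
    h = (x - a) / n
    total = g(a) + g(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def g(t):
    # Illustrative integrand (an assumption made for this sketch).
    return math.sqrt(t * t + 4)

b = find_base()
print(f"b is approximately {b:.6f}")

# Table of values: each x gives one number, so the integral defines a function.
for x in [0.0, 0.5, 1.0, 1.5, 2.0]:
    print(f"x = {x:.1f}   f(x) = {integral(g, 0.0, x):.6f}")
```

A student running something like this sees the base settle near 2.71828 and sees the table of f(x) grow steadily with x, which is the kind of discovery the paragraph above has in mind.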

There are numerous other ways in which Morgan makes it clear that he is not, 
with his lean book, trying to cozy up to the reformers. There is nothing in his 
choice of applications that yields to the call to deal with topics that students will 
see as interesting, relevant, or exciting; he is still fencing animals in rectangular 
pens. The exercise lists are very much of the type that can be demolished at a 
keyboard while little learning takes place. There is no concession to the idea of 
including some open-ended questions that invite alternative approaches, that lend 
themselves to extended investigation over time, that offer the opportunity to 
exercise imagination, or that demand a coherent defense of the methodology 
chosen or the answer obtained. 

Frank Morgan has demonstrated that a lean book can be written in a way that 
highlights key ideas. His clear delineation of differentiation, antidifferentiation, 
the integral, and the place of the fundamental theorem is exemplary in this regard. 
That is a contribution toward the goals the reformers had in mind. This reviewer, 
for one, wished he had put his gift for clear exposition in the employ of other goals 
of the reform movement. 
TELEGRAPHIC REVIEWS 


Edited by Arnold Ostebee 


with the assistance of the Mathematics Departments of 
Carleton, Macalester, and St. Olaf Colleges 


Telegraphic Reviews are designed to alert readers in a timely manner to new books 
appropriate to mathematics teaching and research. Special codes classify reviews by 
subject area and appropriate use: 

T: Textbook 
C: Computer Software 
S: Supplementary Reading 
P: Professional Reading 
L: Undergraduate Library 
13: Grade Level 
1-4: Semester 
**: Special Emphasis 
??: Questionable 

Readers are advised that price information is subject to change. Selected books 
receive a second, more extensive review in the Monthly. 


Books submitted for review should be sent to Book Reviews Editor, American Mathe- 
matical Monthly, St. Olaf College, 1520 St. Olaf Avenue, Northfield, MN 55057-1098. 


Mathematics Appreciation, P, L*. Mathemat- 
ics: The Science of Patterns: The Search for 
Order in Life, Mind, and the Universe. Keith 
Devlin. Scientific American Library, 1997, vii 
+ 216 pp, $19.95 (P). [ISBN 0-7167-6022-3; 0- 
7167-5047-3] Updated to reflect recent proof 
of Fermat’s last theorem. Incorporates minor 
corrections and a few additions to list of books 
for further reading (TR, April 1995). AO 


Recreational Mathematics, S, L. Which Way 
Did the Bicycle Go? ... and Other Intrigu- 
ing Mathematical Mysteries. Joseph D.E. Kon- 
hauser, Dan Velleman, Stan Wagon. Dolciani 
Math. Expos., No. 18. MAA, 1996, xv + 235 pp, 
$24.95 (P). [ISBN 0-88385-325-6] Almost 
two-hundred problems selected from 25 years 
of Macalester College’s “Problem of the Week.” 
Most of the problems should be accessible to (if 
not always solvable by) high school students. 
Solutions and references. Lots of fun. JO 


Recreational Mathematics, S. A Mathemati- 
cal Jamboree. Brian Bolt. Cambridge Univ Pr, 
1995, 111 pp, $17.95 (P). [ISBN 0-521-48589- 
4] 114 miscellaneous mathematical puzzles, 
some new and some old chestnuts. Most are 
appropriate for high school students, but a few 
are harder. Includes solutions and instructions 
for building several kinds of harmonographs 
(Spirograph™-like devices). JO 


Recreational Mathematics, S. Winning So- 
lutions. Edward Lozansky, Cecil Rousseau. 
Problem Books in Math. Springer-Verlag, 
1996, x + 244 pp, $34.95 (P). [ISBN 0- 
387-94743-4] Intended to bridge the gap be- 
tween the mathematics typically taught in high 
schools and the mathematics required for 
success in high-level mathematical competitions. 
Sections on number theory, inequalities, and 
combinatorics. JO 


Recreational Mathematics, P. The Red 
Book of Mathematical Problems. Kenneth S. 
Williams, Kenneth Hardy. Dover, 1996, ix 
+ 174 pp, $6.95 (P). [ISBN 0-486-69415-1] 
Republication, with corrections, of The Red 
Book: 100 Practice Problems for Undergrad- 
uate Mathematics Competitions published by 
Integer Press in 1988 (TR, February 1989). 


Precalculus, T(13: 1). Workshop Calculus: 
Guided Exploration with Review, Volume 1. 
Nancy Baxter Hastings, et al. Springer-Verlag, 
1997, xxii + 391 pp, $29.95 (P). [ISBN 0-387- 
94611-X] Laboratory workbook. Integrates 
precalculus with topics from calculus including 
limits, the derivative, and the definite integral. 
Stresses individual and group discovery using a 
computer algebra system. PG 


Precalculus, T(13). Algebra with Applications. 
William J. Adams. Kendall/Hunt, 1995, xii + 
361 pp, $52.44 (P). [ISBN 0-7872-0996-1] 
Curiously old-fashioned treatment with many 
“hows” and few “whys.” Pre-modern math era 
treatment uses “transposition” (transfer a term 
from one side of an equation to another and 
change its sign) as a basic operation for solving 
equations and inequalities. Similar “mindless 
rule” approach to most topics. Of questionable 
value for any purpose. MW 


Education, S(15-17). Ideas: NCTM Standards- 
Based Instruction. Ed: Michael C. Hynes. 
NCTM, $11.50 (P) each. Grades K-4, 1995, 
119 pp, [ISBN 0-87353-422-0]; Grades 5-8, 
1996, v + 129 pp. [ISBN 0-87353-426-3] 
Each book contains nearly 50 reproducible one- 
page activity sheets adapted from the “Ideas” 
department in the Arithmetic Teacher. Good 
source of activities for pre-service and in- 
service mathematics teachers. MW 


Education, T(14-16: 2). Mathematics for 
Elementary Teachers: An Interactive Approach, 
Second Edition. Thomas Sonnabend. Saunders 
College, 1997, xxii + 928 pp, $51.50. [ISBN 
0-03-018367-7] Straightforward presentation 
of standard topics with interspersed “lesson ex- 
ercises” and frequent calls for students to ex- 
plain or predict outcomes. This edition up- 
dates suggested readings and technology cov- 
erage (though technology is still not integral). 
(First Edition, TR, May 1994.) MW 


Education, P. Towards Gender Equity in 
Mathematics Education: An ICMI Study. Ed: Gila 
Hanna. ICMI Stud. Ser., V. 3. Kluwer Aca- 
demic, 1996, xii + 304 pp, $133. [ISBN 0- 
7923-3921-5] Articles explore issues raised at 
1993 International Commission on Mathemat- 
ical Instruction Conference. Describes 13 na- 
tional education systems from the perspective 
of gender equity, opportunity, classroom expe- 
riences, and educational outcomes. Research 
summaries; cross-cultural perspectives. MW 


Education, S(17-18), P. A Cognitive Analysis 
of U.S. and Chinese Students’ Mathematical 
Performance on Tasks Involving Computation, 
Simple Problem Solving, and Complex Problem 
Solving. Jinfa Cai. Journal for Res. in Math. 
Educ., Mono. No. 7. NCTM, vii + 151 pp, 
$7.50 (P). [ISBN 0-87353-424-7] Analyzes 
performance of U.S. and Chinese 6th grade 
students on different types of tasks. Chinese 
students substantially outperformed U.S. stu- 
dents on computation tasks, but not on complex 
problem-solving tasks. Surprisingly, Chinese 
students had much higher non-response rates 
on open-ended problems. MW 

Logic, T(15-16: 1, 2), L. Set Theory, Logic 
and their Limitations. Moshé Machover. Cam- 
bridge Univ Pr, 1996, ix + 288 pp, $24.95 (P); 
$60. [ISBN 0-521-47998-3; 0-521-47493-0] 
Lecture notes for philosophy and mathematics 
students. Covers axiomatic set theory, cardi- 
nals and ordinals, propositional and first-order 
logic, and limitative results of Skolem, Tarski, 
Church, and Gödel. Concise presentation with 
few examples. Good remarks on history and 
philosophical issues. KES 


Logic, P. Henkin-Keisler Models. George 
Weaver. Math. & Its Applic., V. 392. 
Kluwer Academic, 1997, xii + 253 pp, 
$146. [ISBN 0-7923-4366-2] Henkin used 
the method of constants to construct 
interpretations for first-order languages. Keisler 
modified the method to motivate the ultraproduct 
construction. Monograph describes Keisler’s 
method and illustrates its use as an alternative 
to ultraproducts. KES 


Foundations, T(14: 1), L. A Transition to 
Advanced Mathematics, Fourth Edition. 
Douglas Smith, Maurice Eggen, Richard St. 
Andre. Brooks/Cole, 1997, viii + 344 pp, $62.95. 
[ISBN 0-534-34028-8] Discusses methods of 
mathematical proof. Covers material on sets, 
relations, functions, cardinality, groups, and the 
completeness of the real numbers. (Third 
Edition, TR, November 1990; Extended Review, 
February 1991.) PG 


Combinatorics, P. Sperner Theory. Konrad 
Engel. Ency. of Math. & Its Applic., V. 65. 
Cambridge Univ Pr, 1997, ix + 417 pp, $69.96. 
[ISBN 0-521-45206-6] Starts with statement 
and proof of Sperner’s theorem. Develops 
theory of extremal problems on finite partially 
ordered sets. Covers flow-theoretic approach 
in Sperner theory; matchings and symmetric 
chain orders; probabilistic and algebraic 
methods; Macaulay posets. LC 


Combinatorics, T(17), P. One-Factorizations. 
W.D. Wallis. Math. & Its Applic., V. 390. 
Kluwer Academic, 1997, xiv + 242 pp, $143. 
[ISBN 0-7923-4323-9] A one-factorization of 
a graph is a decomposition of the edge set of 
the graph into edge disjoint factors (spanning 
graphs) which are regular of degree 1. This text 
covers one-factors and one-factorizations, their 
applications, and connections to design theory. 
Includes exercises, extensive references. LC 


Combinatorics, T(15-17: 1, 2), L. Graphs 
& Digraphs, Third Edition. Gary Chartrand, 
Linda Lesniak. Chapman & Hall, 1996, x 
+ 422 pp, $59.95. [ISBN 0-412-98721-X] 
Changes from Second Edition (TR, December 
1986): expanded treatment of domination in 
graphs, Hamiltonian graphs, graph 
decompositions, extremal graph theory; introductions to 
voltage graphs, graph labelings, and 
probabilistic graph theory; additional exercises; updated 
bibliography. KES 


Combinatorics, T(17-18: 1), P, L. Integer 
Flows and Cycle Covers of Graphs. Cun-Quan 
Zhang. Pure & Appl. Math., V. 205. Marcel 
Dekker, 1997, xii + 379 pp, $145. [ISBN 
0-8247-9790-6] The problem of face-coloring a 
planar graph can be formulated in terms of 
integer flows or cycle covers. This introduction 
presents main graph-theoretic results and open 
problems. Requires only basic background in 
graph theory. LB 


Number Theory, T(13-14). The Theory of 
Remainders. Andrea Rothbart. Janson, 1995, 
xii + 178 pp, $19.95 (P). [ISBN 0-939765-82- 
9] Whimsical, but substantial, look at modu- 
lar structures narrated by Ant (Amateur Num- 
ber Theorist) and Gnam (Game Nut and Magi- 
cian). Developed to give middle school teachers 
a conceptual and connected look at authentic 
algebra. Suitable for liberal arts mathematics 
students as well as sophomore majors in “tran- 
sition” courses. Exercises advance conceptual 
development. Optional problems accommodate 
different levels of sophistication. MW 


Linear Algebra, T(18: 2), P. Matrix Analysis. 
Rajendra Bhatia. Grad. Texts in Math., V. 169. 
Springer-Verlag, 1997, xi + 347 pp, $49.95. 
[ISBN 0-387-94846-5] Advanced matrix the- 
ory with an emphasis on functional analytic 
properties. An excellent resource for matrix 
inequalities. BK 


Algebra, T(16-17: 1), S**, L. Geometry of 
the Quintic. Jerry Shurman. Wiley, 1997, xi 
+ 200 pp, $39.95 (P). [ISBN 0-471-13017- 
6] Interesting exploration of a single problem; 
draws on a range of (usually disparate) under- 
graduate topics: Riemann sphere and analysis, 
groups and algebra, icosahedral (and basic al- 
gebraic) geometry; uses ideas from Felix Klein 
through recent work on iterative solution meth- 
ods. A nice “capstone” read for an ambitious 
math major. RM 


Algebra, P. Finite Fields: Normal Bases and 
Completely Free Elements. Dirk Hachenberger. 
Intern. Ser. Eng. & Comp. Sci. Kluwer Aca- 
demic, 1997, xii + 171 pp, $87.50. [ISBN 0- 
7923-9851-3] Considers the characterization, 
enumeration, and explicit construction of com- 
pletely free elements in arbitrary finite dimen- 
sional extensions over finite fields. Assumes 
familiarity with basic Galois theory and finite 
field theory. CEC 


Algebra, P. A Survey of Modern Algebra. Gar- 
rett Birkhoff, Saunders MacLane. AK Peters, 
1997, xi+ 500 pp, $59. [ISBN 1-56881-068-7] 
A corrected reprint of the Fourth Edition (TR, 
August-September 1977) by a new publisher. 


Algebra, T(15: 1), S, L. Numbers and Sym- 
metry: An Introduction to Algebra. Bernard 
L. Johnston, Fred Richman. CRC Pr, 1997, 
260 pp, $29.95 (P). [ISBN 0-8493-0301-X] 
Introduces mathematical structures in concrete 
settings. Gaussian integers and polynomials are 
emphasized in the study of rings. Symmetry, 
to the extent of classifying wallpaper patterns, 
is the motivator for groups; linear algebra is 
used to study error-correcting codes over finite 
fields. CEC 


Complex Analysis, T(17: 1). Topics in Com- 
plex Analysis. Mats Andersson. Universi- 
text. Springer-Verlag, 1997, viii + 157 pp, 
$32.50 (P). [ISBN 0-387-94754-X] Moves 
quickly through fundamental results and 
focuses on connections to real analysis and 
harmonic analysis, e.g., Fatou’s theorem, 
Nevanlinna theory, corona theory, $H^p$ theory, and 
$H^1$-BMO duality. Assumes background in 
real analysis, integration theory, and functional 
analysis. PG 


Complex Analysis, T*(16-17: 2). Complex 
Analysis, Second Edition. Joseph Bak, Don- 
ald J. Newman. Undergrad. Texts in Math. 
Springer-Verlag, 1997, x + 294 pp, $39.95. 
[ISBN 0-387-94756-6] No major changes in 
this edition. Still a very attractive textbook. 
(First Edition, TR, April 1983; Extended Re- 
view, March 1990.) TAV 


Differential Equations, T*(16-17), L. 
Nonlinear Differential Equations and 
Dynamical Systems, Second Revised and Expanded 
Edition. Ferdinand Verhulst. Universitext. 
Springer-Verlag, 1996, x + 303 pp, $32.50 (P). 
[ISBN 0-540-60934-2] A nonponderous text 
for a second course in ODEs. Covers basic 
concepts of nonlinear DEs and stability analysis; 
defers bifurcation theory, chaos, and 
Hamiltonian systems to the end. Many examples and 
exercises with answers and hints. Few 
applications, but a nice theoretical text for those mainly 
interested in applications. (First Edition, TR, 
November 1990.) DK 


Differential Equations, S(14), C. The Maple 
O.D.E. Lab Book. Darren Redfern, Edgar 
Chandler. Springer-Verlag, 1996, ix + 160 pp, 
$29.95 (P), with disk. [ISBN 0-387-94733-7] 
Introduction to the use of Maple V Release 4 for 
solving and analyzing first-order ODEs and sys- 
tems. Includes explicit and implicit solutions, 
direction fields, phase planes, matrix methods, 
numerical methods, and applications. PG 


Differential Equations, T(14). Differential 
Equations with Boundary-Value Problems, 
Fourth Edition. Dennis G. Zill, Michael R. 
Cullen. Brooks/Cole, xv + 652 pp, $76. [ISBN 
0-534-95580-0] Changes from Third Edition 
include increased emphasis on using 
technology, non-linear equations and systems, 
boundary value problems, and modeling. (Second 
Edition, February 1989.) SK 

Differential Equations, P. A Treatise 
on Differential Equations, Sixth Edition. 
A.R. Forsyth. Dover, 1996, xviii + 
583 pp, $14.95 (P). [ISBN 0-486-69314-7] 
Unabridged republication of the 1956 printing 
of the 1929 Macmillan & Co. edition. 


Differential Equations, P. Catastrophe The- 
ory and Its Applications. Tim Poston, Ian Stew- 
art. Dover, 1996, xviii + 491 pp, $18.95 (P). 
[ISBN 0-486-69271-X] Unabridged republi- 
cation of the 1956 printing of the 1929 Macmil- 
lan & Co. edition. (1978 Pitman hardcover edi- 
tion, TR, August-September 1978.) 


Dynamical Systems, T**(14-16), L. Chaos: 
An Introduction to Dynamical Systems. 
Kathleen T. Alligood, Tim D. Sauer, James A. Yorke. 
Springer-Verlag, 1997, xvii + 603 pp, $39 (P). 
[ISBN 0-387-94677-2] An excellent text for 
an introductory dynamics course. Informal, but 
precise, mathematical style; quite accessible 
and motivating. Wonderful chapter-by-chapter 
“Lab Visits” (diverse applications of the 
concepts) and “Challenges” (short-term projects). 
Highly recommended. DK 


Dynamical Systems, T(16-18), L.  Intro- 
duction to the Theory of Stability. David 
R. Merkin. Transl. & Ed.: Fred F. Afagh, 
Andrei L. Smirnov. Texts in Appl. Math., 
V. 24. Springer-Verlag, 1997, xx + 319 pp, 
$49.95. [ISBN 0-387-94761-2] Translation 
of the Third Edition of a Russian text. Heavy 
emphasis on applications; roughly 1/4 of the 
text is given over to engineering and science 
applications and examples. DK 


Numerical Analysis, S(16-18), P. Dynamical 
Systems and Numerical Analysis. A.M. Stuart, 
A.R. Humphries. Cambridge Univ Pr, 1996, 
xxii + 685 pp, $59.95. [ISBN 0-521-49672-1] 
An in-depth research monograph suitable for 
graduate courses in numerical analysis. Treats 
numerical computation as a dynamical system 
and examines two main issues: convergence of 
numerical attracting sets to true sets; 
preservation of structural properties (e.g., the 
Hamiltonian) by the numerical method. Examples 
throughout; many exercises. DK 


Numerical Analysis, P. Domain Decompo- 
sition: Parallel Multilevel Methods for Elliptic 
Partial Differential Equations. Barry F. Smith, 
Petter E. Bjørstad, William D. Gropp. 
Cambridge Univ Pr, 1996, xii + 224 pp, $39.95. 
[ISBN 0-521-49589-X] Domain decomposi- 
tion algorithms are used to break a PDE into 
subproblems to facilitate parallel computation 
of the solution. This book presents many do- 
main decomposition algorithms and discusses 
their implementation. JO 


Operator Theory, P. Haar Series and 
Linear Operators. Igor Novikov, Evgenij 
Semenov. Math. & Its Applic., V. 367. Kluwer 
Academic, 1997, xv + 218 pp, $118. [ISBN 
0-7923-4006-X] Treats unconditional conver- 
gence; Fourier-Haar coefficients; reproducibil- 
ity; martingales; monotone bases; criterion 
of equivalence of Haar and Franklin systems. 
Sadly, a quick reading caught several errors—a 
typo in the preface, a misspelling in the ac- 
knowledgement, an incorrect reference to an 
item in the bibliography (which, incidentally, 
contains 354 items!). KS 


Functional Analysis, P. The Descriptive Set 
Theory of Polish Group Actions. Howard 
Becker, Alexander S. Kechris. London Math. 
Soc. Lect. Note Ser., V. 232. Cambridge Univ 
Pr, 1996, xi + 136 pp, $34.95 (P). [ISBN 0- 
521-57605-9] 


Analysis, T**(14: 1, 2), L*. An Introduction 
to Analysis. Gerald G. Bilodeau, Paul R. Thie. 
Intern. Ser. in Pure & Appl. Math. McGraw- 
Hill, 1997, xi + 225 pp, $68.75. [ISBN 0-07- 
005662-5] A concise introductory text for a 
beginning analysis course. The topics are stan- 
dard, from the development of the reals through 
uniform convergence, but the writing is espe- 
cially clear. Extensive problem sets with in- 
teresting and well-chosen problems. A very 
attractive option; well worth considering. TAV 


Analysis, T(16-18), S, L. A Mathematical 
Introduction to Wavelets. P. Wojtaszczyk. 
London Math. Soc. Stud. Texts, V. 37. 
Cambridge Univ Pr, 1997, xii + 261 pp, $21.95 (P); 
$59.95. [ISBN 0-521-57894-9; 0-521-57020-4] 
Though the literature on the ‘newish’ 
subject of wavelets is enormous, there is still a lack 
of readable introductory texts. This is one of 
many new books aimed at filling this gap. It is 
intended for (pure) math students who have a 
solid background in real analysis, and a 
knowledge of Lebesgue theory, Fourier series, and 
Hilbert spaces. KS 


Analysis, P. A Course of Modern Analysis, 
Fourth Edition. E.T. Whittaker, G.N. Watson. 
Cambridge Univ Pr, 1996, 608 pp, $49.95 (P). 
[ISBN 0-521-58807-3] A reprint of the 1927 
edition. 


Analysis, P. Asymptotic Attainability. A.G. 
Chentsov. Math. & Its Applic., V. 383. Kluwer 
Academic, 1997, xiv + 322 pp, $156. [ISBN 
0-7923-4302-6] 


Analysis, T?(18: 2). Measures and Probabili- 
ties. Michel Simonnet. Universitext. Springer- 
Verlag, 1996, xiii + 510 pp, $44 (P). [ISBN 
0-387-94644-6] The author describes his 
approach to measure and integration as an 
“introductory, yet sophisticated treatment.” It is 
introductory in the sense that no previous course in 
measure is assumed, but the rudiments of topol- 
ogy and functional analysis are. It certainly 
is sophisticated as it uses the development of 
Daniell measure to provide results on Radon as 
well as abstract measures on sets. In three parts: 
Daniell Measures, Measures on Subrings, and 
Convergence of Random Variables. TAV 


Algebraic Geometry, T(18), P. Lectures on 
Vector Bundles. J. Le Potier. Transl: A. Ma- 
ciocia. Stud. in Adv. Math., V. 54. Cambridge 
Univ Pr, 1997, viii + 251 pp, $59.95. [ISBN 0- 
521-48182-1] Lectures from two courses on 
moduli spaces of vector bundles. Part 1 con- 
cerns the classification of vector bundles on 
algebraic curves. Part 2 concerns semi-stable 
sheaves on the projective plane. Assumes back- 
ground in algebraic geometry, Chern classes, 
and algebraic sheaves. JD 


Differential Geometry, T(18: 1), P. Confor- 
mal Differential Geometry and Its Generaliza- 
tions. Maks A. Akivis, Vladislav V. Gold- 
berg. Pure & Appl. Math. Wiley, 1996, xiv 
+ 383 pp, $69.96. [ISBN 0-471-14958-6] A 
conformal space is, roughly, an n-dimensional 
sphere equipped with the group of conformal 
transformations on the sphere. Conformal dif- 
ferential geometry concerns such spaces and 
related ideas. This book gives a thorough in- 
troduction to the subject. JO 


Algebraic Topology, S(17), P, L. A User’s 
Guide to Algebraic Topology. C.T.J. Dod- 
son, Phillip E. Parker. Math. & Its Applic., 
V. 387. Kluwer Academic, 1997, xii + 405 pp, 
$209. [ISBN 0-7923-4292-5] Not a text— 
many proofs are merely referenced. Covers ex- 
tension and lifting problems, homotopy, coho- 
mology, sheaves, bundles, obstruction theory. 
Appendices on algebra and general topology. 
Applications to theoretical physics. JD 


Topology, P. The Hauptvermutung Book: A 
Collection of Papers of the Topology of Man- 
ifolds. Ed: A.A. Ranicki, et al. K-Mono. 
in Math., V. 1. Kluwer Academic, 1996, 
vi + 190 pp, $108. [ISBN 0-7923-4174-0] 
The Hauptvermutung is the conjecture that any 
two triangulations of a polyhedron (or a mani- 
fold) are combinatorially equivalent. This book 
brings together the literature on the subject, in- 
cluding proofs of several results that have been 
assumed but unpublished up to now. JO 

Topology, T(18), P. Theory of Degrees with 
Applications to Bifurcations and Differential 
Equations. Wieslaw Krawcewicz, Jianhong 
Wu. Canadian Math. Soc. Ser. of Mono. & 
Adv. Texts. Wiley, 1997, xiv + 374 pp, $89.95. 
[ISBN 0-471-15740-6] A unified treatment of 
Brouwer (and extensions), Dold-Ulrich, and 
$S^1$-degrees and their relation to the existence of 
bifurcations of ODEs. Requires some knowledge 
of analysis, topology, and ODEs. SK 


Optimal Control, P. Geometric Control The- 
ory. Velimir Jurdjevic. Stud. in Adv. Math., 
V.51. Cambridge Univ Pr, 1997, xviii + 492 pp, 
$79.95. [ISBN 0-521-49502-4] 


Optimal Control, P. Nonlinear $H_\infty$ 
Control: The Singular Case. W.C.A. Maas. CWI 
Tract, No. 118. Stichting Mathematisch Cen- 
trum, 1996, v + 197 pp, Dfl. 40 (P). [ISBN 
90-6196-468-7] 


Probability, T*(17-18: 1, 2). A Modern 
Approach to Probability Theory. Bert Fristedt, 
Lawrence Gray. Prob. & Its Applic. Birkhäuser 
Boston, 1997, xx + 756 pp, $64.50. [ISBN 
0-8176-3807-5] Another possible title is 
“Everything You Need to Know About the Theory 
of Probability.” This ambitious work covers 
the essentials in a clear and readable fashion. 
From σ-fields to sums and convergences to 
interacting particle systems with all the stops in 
between. A must for professionals and an 
attractive text for a graduate course. Extensive 
appendices, credits, and comments. TAV 


Statistical Methods, T(17-18: 1). Applied 
Factor Analysis in the Natural Sciences. 
Richard A. Reyment, K.G. Jöreskog. 
Cambridge Univ Pr, 1996, xii + 371 pp, $39.95 (P); 
$54.95. [ISBN 0-521-57556-7; 0-521-41242-0] 
Provides an introduction to multivariate 
data analysis, aspects of linear algebra, and 
factor analysis. Discusses R-mode, Q-mode, 
and Q-R-mode methods. Contains examples 
and case histories from geology and ecology. 
Appendix contains MATLAB computer pro- 
grams. RS 


Statistical Methods, T(17-18), P. Nonpara- 
metric Methods for Quantitative Analysis, Third 
Edition. Jean Dickinson Gibbons. American 
Sciences Pr, 1997, xvi + 537 pp, $70.95 (P). 
[ISBN 0-935950-37-0] New edition incor- 
porates computer package (MINITAB, SAS, 
SPSS) solutions in many examples. Several 
new topics; expanded problem sets. (Second 
Edition, TR, November 1986.) RS 


Statistics, P. Statistical Test Limits in Qual- 
ity Control. G.D. Otten. CWI Tract, No. 116. 
Stichting Mathematisch Centrum, 1996, vi + 
144 pp, Dfl. 35 (P). [ISBN 90-6196-469-5] 


Algorithms, T(18: 1), P. Parallel Algorithms 
for Regular Architectures: Meshes and Pyra- 
mids. Russ Miller, Quentin F. Stout. MIT Pr, 
1996, xvii + 310 pp, $40. [ISBN 0-262-13233-8] 
Good presentation of parallel algorithms 
for mesh and pyramid parallel architectures. 
Considers sorting and matrix multiplication as 
well as many problems from image processing, 
computational geometry, and graph theory. JO 


Theory of Computation, T(15: 1), L. Lan- 
guages and Machines: An Introduction to the 
Theory of Computer Science, Second Edition. 
Thomas A. Sudkamp. Addison-Wesley, 1997, 
xv + 569 pp, $45.95. [ISBN 0-201-82136-2] 
Good exposition, proofs, and coverage. Con- 
nections to practical computing include a sec- 
tion on a context-free grammar describing Pas- 
cal, and sections on LL(k) and LR(k) gram- 
mars. JO 


Computer Science, P. Logic Programming. 
Ed: Michael Maher. MIT Pr, 1996, xix + 
554 pp, $85 (P). [ISBN 0-262-63173-3] Pro- 
ceedings of the 1996 Joint International Confer- 
ence and Symposium on Logic Programming 
held in Bonn, Germany. 


Computer Science, P. Computational Differ- 
entiation: Techniques, Applications, and Tools. 
Eds: Martin Berz, et al. SIAM, 1996, xv + 
421 pp, $65 (P). [ISBN 0-89871-385-4]  Pa- 
pers from the Second International Workshop 
on Computational Differentiation held in Febru- 
ary 1996 in Santa Fe, New Mexico. 


Computer Science, T(15-17: 1), S**, P. 
Algebra of Programming. Richard Bird, Oege 
de Moor. Intern. Ser. in Comp. Sci. Prentice 
Hall, 1997, xiv + 295 pp. [ISBN 0-13-507245-X] 
One may calculate, algebraically, a 
program for a class of problems (parametrized by 
a data type) whose specification satisfies some 
structure by proving a theorem which asserts a 
strategy (greedy, dynamic programming, etc.) 
applies, then algebraically instantiating the al- 
gorithm (e.g., from the proof). Nice develop- 
ment based on categorical calculus of relations 
instantiated into functional programming lan- 
guage. RM 


Computer Science, P. Graph Reduction on 
Shared-Memory Multiprocessors. K.G. Lan- 
gendoen. CWI Tract, No. 117. Stichting 
Mathematisch Centrum, 1996, vii + 199 pp, 
Dfl. 40 (P). [ISBN 90-6196-470-9] 


Applications (Biological Science), P. Next 
Generation Environmental Models and 
Computational Methods. Eds: George Delic, Mary 
F. Wheeler. SIAM, 1997, xii + 375 pp, $65 (P). 
[ISBN 0-89871-378-1] Proceedings of a 1995 
workshop at the National Environmental Super- 
computing Center, Bay City, Michigan. Topics 
include global and regional circulation models, 
aquatic systems, groundwater transport of 
contaminants, and inverse problem methods. 


Applications (Communication Theory), P. 
Chaotic, Fractal, and Nonlinear Signal Pro- 
cessing. Ed: Richard A. Katz. American 
Institute of Physics Pr, 1996, xxi + 847 pp, 
$165. [ISBN 1-56396-443-0] Technical pa- 
pers from the ONR/NUWC Third Technical 
Conference on Nonlinear Dynamics and Full- 
spectrum Processing held in 1995 in Mystic, 
Connecticut. 


Applications (Mechanics), P. Global Bifur- 
cation in Variational Inequalities: Applications 
to Obstacle and Unilateral Problems. Vy Khoi 
Le, Klaus Schmitt. Appl. Math. Sci., V. 123. 
Springer-Verlag, 1997, xiv + 250 pp, $59.95. 
[ISBN 0-387-94886-4] 


Applications (Quantum Theory), P. Proceed- 
ings Seminar 1989-1990; Mathematical Struc- 
tures in Field Theory. Eds: E.A. de Kerf, 
H.G.J. Pils. CWI Syllabus, No. 39. Sticht- 
ing Mathematisch Centrum, 1996, 163 pp, 
Dfl. 40 (P). [ISBN 90-6196-448-2] Selected 
lectures from a seminar at the University of Am- 
sterdam. 


Applications (Systems Theory), P. Lecture 
Notes in Control and Information Sciences— 
223: Experimental Robotics IV. Eds: O. Khatib, 
J.K. Salisbury. Springer-Verlag, 1997, xix + 
574 pp, $86 (P). [ISBN 3-540-76133-0] Pro- 
ceedings of the Fourth International Sympo- 
sium held at Stanford University in 1995. 


Applications, P. Introduction to the Mathemat- 
ics of Inversion in Remote Sensing and Indirect 
Measurements. S. Twomey. Dover, 1996, x 
+ 243 pp, $9.95 (P). [ISBN 0-486-69451-8] 
Republication, with corrections, of Volume 3 of 
Developments in Geomathematics published by 
Elsevier Scientific in 1977. 


Applications, P. Chaos and the Changing Na- 
ture of Science and Medicine: An Introduction. 
Eds: Donald E. Herbert, et al. American Insti- 
tute of Physics Pr, 1996, xi + 203 pp, $130. 
[ISBN 1-56396-442-2] Papers from a 1995 
workshop in Mobile, Alabama. 
CEC: Clifton E. Corzatt, St. Olaf; JD: Jill Dietz, St. Olaf;
PG: Philip Gloor, St. Olaf; DK: Danny Kaplan, Macalester;
SK: Steve Kennedy, Carleton; BK: Brenda Kroschel,
Macalester; RM: Richard Molnar, Macalester; JO: Jeff
Ondich, Carleton; AO: Arnold Ostebee, St. Olaf; KS:
Karen Saxe, Macalester; RS: Richard Single, St. Olaf; 
KES: Kay E. Smith, St. Olaf; LAS: Lynn Arthur Steen, 
St. Olaf; TAV: Theodore A. Vessey, St. Olaf; MW: Martha 
Wallace, St. Olaf. 


[October 


THE AUTHORS 


NORBERT PEYERIMHOFF received his Ph.D. in 1993 at the University of Augsburg, under the
direction of Jochen Brüning. In 1994-1996, as a postdoctoral fellow supported by the Deutsche
Forschungsgemeinschaft (DFG), he enjoyed the great hospitality of the Graduate Center of the City University of
New York (CUNY) and, in particular, the differential geometry group. In New York, he enjoyed lessons 
in ballroom and Latin dancing and spent a lot of money on mathematics books. Recently, he accepted a 
position as Assistant at the University of Basel. 
research is structure theory of Hopf algebras.


TOM RICHMOND received his Ph.D. from Washington State University in 1986 under the direction of 
Darrell C. Kent. Since that time he has been at Western Kentucky University where he is professor of 
mathematics. His research interests include partial orders and topology. 


ALBERT FATHI obtained his Ph.D. from Université Paris-Sud in 1980. He held positions in CNRS- 
Université Paris-Sud and University of Florida, Gainesville. He is now at Ecole Normale Supérieure de 
Lyon. His main interests are dynamical systems and geometry. 


JEFF KNISLEY received his Ph.D. at Vanderbilt University in 1990, and is now at East Tennessee 
State University in Johnson City, Tennessee. His interests include applied mathematics and operator 
theory, and he is working with a neurophysiologist to study mathematical models of the flow of 
electricity through the dendrites of a neuron. 


ARNOLD OSTEBEE was raised on a farm in Iowa. As an undergraduate at St. Olaf College, he 
majored in mathematics and physics. Subsequently, he completed his Ph.D. in mathematical physics 
(rigorous statistical mechanics) at SUNY/Stony Brook. He returned to St. Olaf as a member of the
faculty in 1980. He now serves as chair of St. Olaf’s mathematics department and as the editor of the 
Telegraphic Reviews section of the MONTHLY. 


PAUL ZORN was born in India, and had his primary and secondary schooling there. He did his 
undergraduate work at Washington University in St. Louis, and his Ph.D., in complex analysis, at the 
University of Washington, Seattle. Since 1981 he has been on the mathematics faculty at St. Olaf 
College. His editorial work includes service with FOCUS, the New Mathematical Library, and the 1994 
and 1995 editions of What’s Happening in the Mathematical Sciences. He is the editor of Mathematics 
Magazine. 




JIM KAPUT began professional life as a mathematician working in category theory with a Ph.D. from 
Clark University under John Kennison in 1968. In the early 1970s he turned his attention to college 
student mathematical learning and thinking. Since then he has been tracing the developmental and 
semiotic path of mathematical learning backwards to earlier grade levels, and examining the represen- 
tational roles of technology in learning, teaching, and thinking. He also enjoys wearing out running 
shoes and hiking boots. 


RICHARD ASKEY started to learn calculus out of two books, the then mainstream book by Sherwood 
and Taylor and a very old fashioned book that had been used at West Point early in the century, to 
judge by the comments in the margin. A bit of each stuck, but no longer taught techniques such as 
integration by undetermined coefficients really made it easier to understand such things as the 
fundamental theorem of calculus. As a result, he has always wondered about the wisdom of trying to 
teach ideas without teaching technical skills at the same time. 


JOSEPH KUPKA was born in 1942 and is not dead yet. He received a Ph.D. in analysis from UC 
Berkeley in 1970 and has been teaching at Monash ever since. He expects that the Berkeley math 
professors would be aghast were they to learn of his post-burnout interests in combinatorial problems 
APOLONIUSZ TYSZKA was graduated from Jagellonian University.


DAN CHRISTENSEN received his bachelor’s degree in pure and applied mathematics from the 
University of Waterloo after toying with computer science for a few years. He is completing graduate 
studies at M.I.T. in homotopy theory, and occasionally finds time to hike the nearby hills and paddle 
the rivers. The work described in his article was done during the summer of 1995 when Dan was 
supervising Mark in the Research Science Institute program at MIT. 


MARK TILFORD is a member of the class of 2000 at Caltech. He completed calculus in ninth grade 
and continued with math classes at Maryville University and Washington University. During the 
summers, he studied at TIP at Duke, YSP at Rose-Hulman, and the Research Science Institute, 
sponsored by the Center for Excellence in Education and MIT, where he worked with Dan Christensen 
on the subset take-away problem. In 1996, Mark received an Honorable Mention on the USAMO and 
Honorable Mention on the Putnam Competition. In his spare time, he collects mysteries, and books 
and articles by Martin Gardner. 


RONALD MICKENS received the Ph.D. in theoretical physics from Vanderbilt University in 1968. For 
the next two years he held a National Science Foundation Post-doctoral Fellowship at the Center for 
Theoretical Physics, Massachusetts Institute of Technology. After a decade of teaching and research at 
Fisk University, he assumed his present position as Callaway Professor of Physics at Clark Atlanta 
University. His main research interests are determining asymptotic properties of the solutions to 
difference equations, discrete modeling of continuous systems, nonlinear oscillations, and the history of 
science. 


WAYNE ROBERTS has been a member of the MAA committee, Calculus Reform and the First Two
Years (CRAFTY) over the entire decade of calculus reform, and was Chair of that committee from 
1992 to 1995. He is editor of the new CRAFTY book, Calculus: The Dynamics of Change and he 
directed the project that developed the five-volume Resources for Calculus published by the MAA. A 
member of the Macalester College faculty for thirty years, and for eight years Chair of the Department 
of Mathematics and Computer Science, Roberts was appointed Provost of the College in July 1995. He 
was awarded the MAA North Central Section Citation for Meritorious Service in 1996. 
EDITOR’S ENDNOTES


This month’s four-article mini-forum on calculus reform addresses important issues 
of concern to everyone involved in collegiate mathematics. We are grateful to Jeff 
Knisley, Paul Zorn, Arnold Ostebee, James Kaput, and Richard Askey for their 
efforts. 

Two recent paperback books contain a wealth of sensible advice for mathemati- 
cal authors: 


Handbook of Writing for the Mathematical Sciences, by Nicholas J. Higham. Society for Industrial 
and Applied Mathematics, Philadelphia, 1993, 241 pp., $21.50. ISBN 0-89871-314-5. 


A Primer of Mathematical Writing, by Steven G. Krantz. American Mathematical Society,
Providence, 1996, 223 pp., $19.00. ISBN 0-8218-0635-1.

Members may obtain these books from their respective societies at a discount.

Readers who have enjoyed filler items containing MONTHLY extracts from 100, 
50, and 25 years ago have Joseph Gallian to thank for their pleasure. Joe reports 
that he selected “short things that were amusing, quaint, or curious. Included are 
problems, book reviews, or simply titles of books reviewed, and announcements of 
various sorts. In the case of the problems, some were chosen because of whom they 
were submitted by (e.g., Pólya, Dickson) while others were chosen to illustrate the
contrast between the problems in current issues and those from 100 years ago.” 

Paolo Ribenboim reports the following comments about his article in the 
August, 1996 issue [103 (1996) 529-538] contributed by S. W. Golomb: 

The quantitative version of the twin primes conjecture is not quite as stated on p. 531. The
conjectured asymptotic form of the number of primes p < N such that p + 2 is also prime is
$C_2 N/(\log N)^2$, where the twin prime constant $C_2 = 1.32032\ldots$ is a certain infinite product over
odd primes [MONTHLY 67 (1960) 767-769].
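
As a rough illustration of the corrected statement, the following sketch counts the primes p < N for which p + 2 is also prime and sets that count beside the conjectured form. The function names `sieve`, `twin_prime_count`, and `conjectured_count` are our own, and the constant is the truncated value quoted above. The conjecture is asymptotic, so only loose agreement should be expected at small N.

```python
import math

# Twin prime constant, truncated as quoted in the note (an approximation).
C2 = 1.32032


def sieve(limit):
    """Sieve of Eratosthenes: is_prime[k] is True exactly when k is prime, 0 <= k <= limit."""
    is_prime = [False, False] + [True] * (limit - 1)
    for k in range(2, math.isqrt(limit) + 1):
        if is_prime[k]:
            for m in range(k * k, limit + 1, k):
                is_prime[m] = False
    return is_prime


def twin_prime_count(n):
    """Number of primes p < n such that p + 2 is also prime."""
    is_prime = sieve(n + 1)  # p + 2 can be as large as n + 1
    return sum(1 for p in range(2, n) if is_prime[p] and is_prime[p + 2])


def conjectured_count(n):
    """Conjectured asymptotic form C2 * n / (log n)^2."""
    return C2 * n / math.log(n) ** 2


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        print(n, twin_prime_count(n), round(conjectured_count(n), 1))
```

The sieve runs to n + 1 because the larger member of a pair may exceed n by one even though p itself stays below n.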


According to L. E. Dickson’s History of the Theory of Numbers, de Polignac’s conjecture is not 
merely “that every even number is the difference of two primes” (p. 531), but that every even 
number is a difference of two primes in infinitely many ways. Thus, de Polignac’s conjecture 
includes the twin primes conjecture as a special case. 


Our December, 1996 issue contained an unattributed filler item (see p. 916) 
extracted from the winter, 1991 issue of The Mathematical Intelligencer. George F. 
Simmons points out that the original source of this item is the preface to his book
Differential Equations with Applications and Historical Notes, second edition, 
McGraw-Hill, Inc., New York, 1991. The full quotation (see p. xix) is: 


Some will think that a mathematical argument either is a proof or is not a proof. In the context 
of elementary analysis I disagree, and believe instead that the proper role of a proof is to carry 
reasonable conviction to one’s intended audience. It seems to me that mathematical rigor is like 
clothing: in its style it ought to suit the occasion, and it diminishes comfort and restricts freedom 
of movement if it is either too loose or too tight. 


Leonard Gillman writes the following about Rethinking Rigor in Calculus in the 
March issue [104 (1997) 231-240]: 


Tom Tucker recommends avoiding deus-ex-machina auxiliary functions in proofs, such as a proof 
of the Mean Value Theorem. Of course. Let f and g be continuous on [a, b] and differentiable 
on (a, b), and suppose f and g agree at a and b. Then f — g satisfies the hypotheses of Rolle’s 
Theorem, so there is a point in (a,b) where f’ and g’ agree. The Mean Value Theorem is the 
special case in which the graph of g is a straight line. This argument is in the calculus text by 
Ford and Ford, McGraw-Hill, 1963. 
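
The special case mentioned in this argument can be written out as follows. This is only a sketch: the explicit formula for g and the name c for the point supplied by Rolle's Theorem are our notation, not taken from the letter.

```latex
Let $g$ be the straight line through $(a, f(a))$ and $(b, f(b))$:
\[
  g(x) = f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a),
  \qquad g(a) = f(a), \quad g(b) = f(b).
\]
Then $f - g$ is continuous on $[a,b]$, differentiable on $(a,b)$, and
vanishes at $a$ and $b$, so Rolle's Theorem gives a point $c \in (a,b)$ with
\[
  f'(c) = g'(c) = \frac{f(b) - f(a)}{b - a}.
\]
```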


Roger A. Horn, Editor 
disciplines. These projects will provide teachers with
material that can help their students understand 
mathematical concepts, develop strong mathematical 
Julia: A Life in Mathematics


Constance Reid 


Constance Reid, an established writer about mathematicians, has
written an excellent and loving book about her sister Julia
Robinson, the mathematician. The author has written that she
wants the book to be one for all age groups and she has succeeded
admirably in making it so.... Julia wanted to be known as a
mathematician, not a woman mathematician and rightly so! However,
she was, and is, a wonderful role model for women aspiring to be
mathematicians. What a great gift this book would be!

—Alice Schafer, Former President, AWM 


This book is a small treasure, one which I want to share with all
my mathematical friends. The assembly of several articles and
additional photos and remarks provides the image of a
mathematician of extraordinary taste, tenacity and generosity.... Julia
Robinson broke ground in displaying the deep connections
between number theory and logic. Her results have led to a
very active area today, making the appearance of this book very
timely. Her work and her example are however timeless and I
can think of no better advice to give a young mathematician,
either in how to do mathematics, or how to behave in
mathematics, than: “Be like Julia!”

—Carol Wood, Deputy Director, MSRI 
Julia is the story of the life of Julia Bowman Robinson, the gifted
and highly original mathematician who during her lifetime
was recognized in ways that no other woman mathematician 
had been recognized up to that time. In 1976 she became the 
first woman mathematician elected to the National Academy of 
Sciences and in 1983 the first woman elected president of the 
American Mathematical Society. 


This unusual book, profusely illustrated with previously 
unpublished personal and mathematical memorabilia, brings 
together in one volume the prizewinning “Autobiography of 
Julia Robinson” by her sister, the popular mathematical 
biographer Constance Reid, and three very personal articles 
about her work by outstanding mathematical colleagues. 


All royalties from sales of this book will go to fund a Julia 
Robinson Prize in Mathematics at the high school from 
which she graduated. 
Vita Mathematica

Historical Research and Integration with Teaching

Ronald Calinger, Editor


The use of the history of mathematics in the teaching
of mathematics at all levels is an idea whose time has
come. To use history in the teaching of undergraduate
mathematics, the instructor must be familiar with
the history as well as the mathematics. Vita
Mathematica will enable college teachers to learn the
relevant history of various topics in the undergraduate
curriculum and help them incorporate this history
in their teaching.


For example, should calculus be approached from a
geometric or an algebraic point of view? The book
shows us how two important eighteenth century
mathematicians, Colin Maclaurin and Joseph-Louis
Lagrange, understood the calculus from these different
standpoints and how their legacy is still important
in teaching calculus today. We also learn why
Lagrange’s algebraic approach dominated teaching in
Germany in the nineteenth century. Some of the reasons
for this are related to the appropriate foundations
of the calculus, and so the book traces the
ancient history of one of the possible foundations,
the concept of indivisibles. Even though we generally
do not use this concept formally today, many ideas
for a heuristic approach to the calculus can be developed
out of his study.


Vita Mathematica contains numerous other articles 
dealing with calculus, with algebra, combinatorics, 
graph theory, and geometry, as well as more general 
articles on teaching courses for prospective teachers. 
This volume, then, demonstrates that the history of 
mathematics is no longer tangential to the mathemat- 
ics curriculum, but in fact deserves a central role. 


Catalog Code: NTE40/JR 
350 pp., Paperbound, 1996, ISBN 0-88385-097-4 
List: $34.95 MAA Member: $29.00 


Phone in Your Order Now! 1-800-331-1622


Monday — Friday 8:30 am — 5:00 pm 
or mail to: The Mathematical Association of America, PO Box 91112, Washington, DC 20090-1112 


FAX (301) 206-9789 


Shipping and Handling: Postage and handling are charged as follows: USA orders (shipped via UPS): $2.95 for the first book, and $1.00 for each additional book. 
Canadian orders: $4.50 for the first book and $1.50 for each additional book. Canadian orders will be shipped within 10 days of receipt of order via the fastest avail- 
able route. We do not ship via UPS into Canada unless the customer specially requests this service. Canadian customers who request UPS shipment will be billed an 
additional 7% of their total order. Overseas orders: $3.50 per item ordered for books sent surface mail. Airmail service is available at a rate of $7.00 per book.
Foreign orders must be paid in US dollars through a US bank or through a New York clearinghouse. Credit card orders are accepted for all customers.


Address
City          State          Zip
Phone

Qty    Catalog Code    Price    Amount
       NTE40/JR
                       Shipping & handling
                       Total

All orders must be prepaid with the exception of books purchased for
resale by bookstores and wholesalers.

Payment: [] Check [] VISA [] MasterCard
Credit Card No.          Expires
Signature


THE MATHEMATICAL ASSOCIATION OF AMERICA 


PROBLEM BOOK V
Over the years perhaps the most popular of the
MAA problem books have been the high school 
contest books, covering the yearly American 
High School Mathematics Examinations (AHSME) 
that began in 1950, co-sponsored from the start 
by the MAA. Book V also includes the first six 
years of the American Invitational Mathematics 
Examination (AIME) which was developed as an 
intermediate step between the AHSME and the 
USA Mathematical Olympiad (USAMO). The 
AIME has a unique answer format — all answers 
are integers between 0 and 999. 


The editors of this volume, George Berzsenyi and 
Stephen B Maurer, were respectively the chair of 
the AIME and the AHSME during this period. In 
addition to a thorough index, they have added 
much material not included in Contest Books I-IV: 
The Contest Problem Book V


American High School Mathematics Examinations and 
American Invitational Mathematics Examinations, 1983-1988 
Series: New Mathematical Library 


George Berzsenyi and Stephen B Maurer 


* a comprehensive guide to other problem materials world wide,
* additional solutions,
* dropped problems,
* statistical information,
* information on test development and history.


This volume is a must for avid fans of elemen- 
tary problems. 


Contest Books I-IV appear as NML volumes 5, 
17, 25, and 29. 
Another elegant collection of problems from Ross Honsberger


The study of mathematics is often undertaken with
an air of such seriousness that it doesn’t always seem
to be much fun at the time. However, it is quite
amazing how many surprising results and brilliant
arguments one is in a position to enjoy with just a
high school background. This is a book of
miscellaneous delights, presented not in an attempt
to instruct but as a harvest of rewards that are due
good high school students and, of course, those
more advanced — their teachers, and everyone in
the university mathematics community. Admittedly,
they take a little concentration, but the price is a
bargain for such gems.

A half dozen essays are sprinkled among some
hundred problems, most of which are the easier
problems that have appeared on various national and

represented — combinatorics, geometry, number
theory, algebra, probability, ..... The sections may be
read in any order. The book concludes with twenty-five
exercises and their detailed solutions.

Something to delight will be found in every section
— a surprising result, an intriguing approach, a stroke
of ingenuity — and the leisurely pace and generous
explanations make them a pleasure to read.

The inspiration for many of the problems came from
the Olympiad Corner of Crux Mathematicorum,
published by the Canadian Mathematical Society.

Catalog Code: DOL-19/JR
328 pp., Paperbound, 1997
ISBN 0-88385-326-4
The MAA is proud to reissue Martin
Gardner’s Penrose Tiles to Trapdoor Ciphers, 
printed with a new bibliography, correc- 
tions to the text, and a postscript from the 
author. Penrose Tiles assembles a collection 
of Gardner’s “Mathematical Games” columns 
from Scientific American that include many 
of the problems, puzzles and paradoxes 
that have earned him a reputation as a 
master mathematical magician. 


Included here are chapters on Conway’s 
surreal numbers, Mandelbrot’s fractals, and 
Smullyan’s logic puzzles, as well as puzzlers 
dealing with hyperbolas, negative numbers, 
pool-ball triangles, and Penrose tiles and 
trapdoor ciphers. And of course, you can 
read of the return of Dr. Irvine Joshua 
Matrix, (famed numerologist and CIA 
operative), one of Martin Gardner’s oldest 
fictional friends. 


Monday — Friday 8:30 am — 5:00 pm 


Phone in Your Order Now! 1-800-331-1622


Penrose Tiles to 
Trapdoor Ciphers 


... and the Return of Dr. Matrix 


MARTIN GARDNER 
A reissue of another Gardner classic
Series: Spectrum 


Read what reviewers have said about Penrose Tiles to Trapdoor 
Ciphers ... 


The scope is extraordinary ... Those fortunate enough to have
encountered Gardner's columns in their original appearance 
can look for personal bonuses of reminiscence as they read this 
book ... Gardner is one of history's great figures of recreational 
mathematics. —New Scientist 


Penrose Tiles to Trapdoor Ciphers is invaluable to those interested
in recreational mathematics and should enlighten those who 
consider such activity to be difficult or boring. 

—The Mathematics Teacher. 


No popular mathematical writer has ever matched Gardner's 
breadth and richness of knowledge and clarity of style, and this 
book is up to his usual unsurpassable standard. 

—American Scientist 
American Mathematical Society


On Being a Department 
Head, a Personal View 


John B. Conway, University of 
Tennessee, Knoxville 


... an interesting and often humorous look 
at academic leadership ...reads like a work 
written by someone who truly has found 
his calling ... 

—Academic Leader 


For years, higher education prospered. It 
loudly proclaimed that college graduates 
command far greater lifetime incomes. 
Ample funding followed. We produced. But 
that argument has begun to sour. A college 
degree has long since stopped being a guar- 
antee of prosperity or even job security. 
Society has begun to question its support of 
universities. In this environment, mathe- 
maticians and all academics must begin to 
change, compete, and seek resources that 
will be used with greater care. It is the only 
solution if we hope to maintain the 
integrity of the enterprise ... 


—from the Preface 


This unique book presents a witty,
well-written personal view about the
experience of being a department head.
Those in academia will profit from the
author’s inside view, and other department
heads and chairs—new and old—
will benefit from the experiences of this
keenly observant colleague.

1996, reprinted 1997; 107 pages; Softcover;
ISBN 0-8218-0615-7; List $24; All AMS members
$19; Order code AHEADMM710


For a limited time, you can enjoy additional
savings on these bestselling books and other
selected titles when you order online via the
AMS Bookstore. The bookstore now includes
the entire backlist of AMS titles—over 2300
books in print! Go to www.ams.org/bookstore
and take advantage now of these Web-only
savings (valid until December 1, 1997).


The Way I Remember It 


Walter Rudin, University of 
Wisconsin, Madison 


Walter Rudin’s memoirs should prove 
to be a delightful read specifically to 
mathematicians, but also to historians 
who are interested in learning about 
his colorful history and ancestry. 
Characterized by his personal style of 
elegance, clarity, and brevity, Rudin 
presents in the first part of the book his 
early memories about his family his- 
tory, his boyhood in Vienna throughout 
the 1920s and 1930s, and his experi- 
ences during World War II. 


Part II offers samples of his work, in 
which he relates where problems came 
from, what their solutions led to, and 
who else was involved. As those who 
are familiar with Rudin’s writing will 
recognize, he brings to this book the 
same care, depth, and originality that is 
the hallmark of his work. 

1997; reprinted 1997; 191 pages; Softcover; ISBN
0-8218-0633-5; List $29; All AMS members $23;
Order code HMATH/12MM710


www.ams.org/bookstore


SPRINGER FOR MATHEMATICS 


» New 
GERARD BUSKES, University of Mississippi, MS and 
A. VAN ROOIJ, Catholic University of Nijmegen, The Netherlands


TOPOLOGICAL SPACES 


From Distance to Neighborhood 


This is a gentle introduction to the 
theory of topological spaces, lead- 
ing the reader to understand what 
is important in topology vis-a-vis 
geometry and analysis. The authors 
have carefully divided the book 
into three sections, The Line and 
the Plane, Metric Spaces, and 
Topological Spaces—in order to 
mitigate the move into higher lev- 
els of abstraction. Students are 
thereby informally assisted in getting acquainted with new 
ideas while remaining on familiar territory. The authors have 
also restricted the mathematical vocabulary in the book to 
avoid overwhelming the reader with the extensive array of 
technical terms that indicate the properties of topological 
spaces. Additionally, the notion of convergence is employed 
to allow students to focus on a central theme while mov- 
ing to a natural understanding of the notion of topology. 
The pace of the book is relaxed with a gradual accelera- 
tion. The initial pace makes the first nine sections a bal- 
anced course in metric spaces, while allowing ample 
material for a two-semester graduate class. A balanced selec- 
tion of carefully crafted exercises complements the book. 
The authors do not assume any previous knowledge of 
axiomatic approach or set theory. 

1997 /328 PP., 151 ILLUS./HARDCOVER/$39.95 


ISBN 0-387-94994-1 
UNDERGRADUATE TEXTS IN MATHEMATICS 


» New 
OLAV KALLENBERG, Auburn University, AL 


FOUNDATIONS OF 
MODERN PROBABILITY 


This book is unique for its broad and yet comprehensive 
coverage of modern probability theory, ranging from first 
principles and standard textbook material to more advanced 
topics. In spite of the economical exposition, careful proofs 
are provided for all main results. After a detailed discus- 
sion of classical limit theorems, martingales, Markov 
chains, random walks, and stationary processes, the author 
moves onto a modern treatment of Brownian motion, Lévy 
processes, weak convergence, Ito calculus, Feller pro- 
cesses, and SDEs. The more advanced parts include mate- 
rial on local time, excursions, and additive functionals, 
diffusion processes, PDEs and potential theory, predictable 
processes, and general semi-martingales. Though primar- 
ily intended as a general reference for researchers and grad- 
uate students in probability theory and related areas of 
analysis, the book is also suitable as a text for graduate and 
seminar courses on all levels, from elementary to advanced. 
Numerous easy to more challenging exercises are provided. 
1997 /APP. 528 PP./HARDCOVER/$62.95 


ISBN 0-387-94957-7 
PROBABILITY AND ITS APPLICATIONS 


Order Today! 


GARY CORNELL, University of Connecticut, Storrs, CT; 
GLENN STEVENS, Boston University, MA and 
JOSEPH H. SILVERMAN, Brown University, Providence, RI 


MODULAR FORMS AND 
FERMAT’S LAST THEOREM 


This volume contains expanded
versions of lectures given at a con- 
ference on number theory and arith- 
metic geometry held August 9-18, 
1995 at Boston University. The 
purpose of the conference, and of 
this book, is to introduce and 
explain the many ideas and tech- 
niques used by Wiles in his proof 
that every (semi-stable) elliptic 
curve over Q is modular, and to 
explain how Wiles’ result can be combined with Ribet’s 
theorem and ideas of Frey and Serre to show, at long last, 
that Fermat’s Last Theorem is true. Contributors to this vol- 
ume include: B. Conrad, H. Darmon, E. de Shalit, B. de 
Smit, F. Diamond, S.J. Edixhoven, G. Frey, S. Gelbart, K. 
Kramer, H.W. Lenstra, Jr., B. Mazur, K. Ribet, D.E. 
Rohrlich, M. Rosen, K. Rubin, R. Schoof, A. Silverberg, 
J.H. Silverman, P. Stevenhagen, G. Stevens, J. Tate, J. 
Tilouine, and L. Washington. 


1997/APP. 616 PP./HARDCOVER/$49.95 
ISBN 0-387-94609-8 


» New 
JOHN STILLWELL, Monash University, Clayton, VICTORIA and 
JAMES A. YORKE, University of Maryland, College Park 


NUMBERS AND GEOMETRY 


Numbers and Geometry is a beautiful and relatively ele- 
mentary account of a part of mathematics where three main 
fields—algebra, analysis and geometry—meet. The aim of 
this book is to give a broad view of these subjects at the 
level of calculus, without being a calculus (or a pre-cal- 
culus) book. Its roots are in arithmetic and geometry, the 
two opposite poles of mathematics, and the source of his- 
toric conceptual conflict. The resolution of this conflict, 
and its role in the development of mathematics, is one of 
the main stories in the book. Stillwell has chosen an array 
of exciting and worthwhile topics and elegantly combines 
mathematical history with mathematics. He believes that 
most of mathematics is about numbers, curves and func- 
tions, and the links between these concepts can be suggested 
by a thorough study of simple examples, such as the cir- 
cle and the square. This book covers the main ideas of 
Euclid—geometry, arithmetic and the theory of real num- 
bers, but with 2000 years of extra insights attached. 
Numbers and Geometry presupposes only high school 
algebra and therefore can be read by any well prepared stu- 
dent entering college. This book will be popular with grad- 
uate students and researchers because it is such an attractive 
and unusual treatment of fundamental topics. Also, it will 
serve admirably in courses aimed at giving students from 
other areas a view of some of the basic ideas in mathematics. 
There is a set of well-written exercises at the end of each 
section. 
NOTICE TO AUTHORS


The MONTHLY publishes articles, as well as notes and 
other features, about mathematics and the profes- 
sion. Its readers span a broad spectrum of mathe- 
matical interests, and include professional mathe- 
maticians as well as students of mathematics at all 
collegiate levels. Authors are invited to submit articles 
and notes that bring interesting mathematical ideas 
to a wide audience of MONTHLY readers. 


The MONTHLY’s readers expect a high standard of 
exposition; they expect articles to inform, stimulate, 
challenge, enlighten, and even entertain. MONTHLY 
articles are meant to be read, enjoyed, and dis- 
cussed, rather than just archived. Articles may be 
expositions of old or new results, historical or bio- 
graphical essays, speculations or definitive treat- 
ments, broad developments, or explorations of a sin- 
gle application. Novelty and generality are far less 
important than clarity of exposition and broad appeal. 
Appropriate figures, diagrams, and photographs are 
encouraged. 


Notes are short, sharply focussed, and possibly infor- 
mal. They are often gems that provide a new proof of 
an old theorem, a novel presentation of a familiar 
theme, or a lively discussion of a single issue. 


Articles and Notes should be sent to the Editor: 


ROGER A. HORN 

1515 Mineral Square, Room 142 
University of Utah 

Salt Lake City, UT 84112 


Please send your email address and 3 copies of the 
complete manuscript (including all figures with cap- 
tions and lettering), typewritten on only one side of 
the paper. In addition, send one original copy of all 
figures without lettering, drawn carefully in black ink 
on separate sheets of paper. 


Letters to the Editor on any topic are invited; please 
send to the MONTHLY’s Utah office. Comments, criti- 
cisms, and suggestions for making the MONTHLY 
more lively, entertaining, and informative are wel- 
come. 


See the MONTHLY section of MAA Online for current 
information such as contents of issues, descriptive 
summaries of forthcoming articles, tips for authors, 
and preparation of manuscripts in TEX: 


http://www.maa.org/

Mathematics, Statistics, and Teaching


George W. Cobb and David S. Moore 


How does statistical thinking differ from mathematical thinking? What is the role 
of mathematics in statistics? If you purge statistics of its mathematical content, 
what intellectual substance remains? 

In what follows, we offer some answers to these questions and relate them to a 
sequence of examples that provide an overview of current statistical practice. 
Along the way, and especially toward the end, we point to some implications for 
the teaching of statistics. 


1. INTRODUCTION: AN OVERVIEW OF STATISTICAL THINKING. Statistics 
is a methodological discipline. It exists not for itself but rather to offer to other 
fields of study a coherent set of ideas and tools for dealing with data. The need for 
such a discipline arises from the omnipresence of variability. Individuals vary. 
Repeated measurements on the same individual vary. In some circumstances, we 
want to find unusual individuals in an overwhelming mass of data. In others, the 
focus is on the variation of measurements. In yet others, we want to detect 
systematic effects against the background noise of individual variation. Statistics 
provides means for dealing with data that take into account the omnipresence of 
variability. 


1.1. The role of context. The focus on variability naturally gives statistics a particu- 
lar content that sets it apart from mathematics itself and from other mathematical 
sciences, but there is more than just content that distinguishes statistical thinking 
from mathematics. Statistics requires a different kind of thinking, because data are 
not just numbers, they are numbers with a context. 


Example 1. The mystery of Andover. The finite sequence (3, 5, 23, 37, 6, 8, 20, 22, 1, 3) 
shows a distinctive pattern when plotted (Figure 1) but the numbers and the 
pattern have no meaning or interest until we know their context. They are in fact 
monthly totals of people formally accused of witchcraft in Essex County, 
Massachusetts, beginning in February, 1692. The plot shows two waves of accusa- 
tions, separated by a low point in the summer of 1692. The pattern becomes still 
more meaningful when we know that the first hanging of a convicted witch 
(Bridget Bishop) took place June 10, 1692: it is not hard to imagine the sobering 
effect of that first execution in the small community of Salem Village (now 
Danvers). But why the second wave of accusations? It turns out that the accusa- 
tions in the first wave were directed against residents of Salem Village, Salem 
Town, and all but one of the half-dozen immediately adjacent towns; in the second 
wave the majority of the accusations were directed against residents of the one 
other adjacent town, Andover. Our sources [3, 4] do not provide much explanation 
for what happened in Andover, but the pattern, together with what we know of the 
context, tells at least part of a story and raises some interesting questions.

Figure 1. Numbers of people accused of witchcraft in Essex County, MA, 1692.


Although this first example has almost no mathematical content, its interplay 
between pattern and context is typical of the interpretive part of statistical 
thinking. For a more familiar example of a very different sort, consider testing that 
two normal distributions have equal means. 


Example 2a. A model for comparing normal means. Consider the standard model 
involving two sets of independent, identically distributed (iid) random variables: 


$$X_1, X_2, \ldots, X_n \ \text{iid}\ N(\mu_1, \sigma_1^2) \qquad Y_1, Y_2, \ldots, Y_m \ \text{iid}\ N(\mu_2, \sigma_2^2)$$


It follows that $\bar{x} = (\sum x_i)/n$ and $s_1^2 = \sum (x_i - \bar{x})^2/(n - 1)$ are sufficient statistics
for $\mu_1$ and $\sigma_1^2$, with parallel results for the Ys. Informally, a statistic is sufficient
for a parameter if it uses all the information about that parameter contained in the
sample. More formally, the conditional distribution of the data, given the sufficient
statistic, doesn’t depend on the parameter. The Rao-Blackwell Theorem guarantees
that no unbiased estimator can have a smaller variance than one based on a
sufficient statistic. Both $\bar{x}$ and $s_1^2$ are unbiased: $E(\bar{x}) = \mu_1$ and $E(s_1^2) = \sigma_1^2$.
Finally, their joint distribution is known: the sample mean $\bar{x}$ is normal with
variance $\sigma_1^2/n$, and, independently, $(n - 1)s_1^2/\sigma_1^2$ is chi-square on $(n - 1)$ degrees
of freedom. Suppose now we want to test $H_0: \mu_1 = \mu_2$. If $\sigma_1^2 = \sigma_2^2$ then a
sufficient and unbiased estimator for the common variance is obtained by pooling:


$$s_p^2 = \left[(n - 1)s_1^2 + (m - 1)s_2^2\right]/(n + m - 2)$$


If $H_0$ is true, then $(\bar{x} - \bar{y})/\left(s_p\sqrt{(1/n) + (1/m)}\right)$ has a Student’s $t$-distribution on
$n + m - 2$ degrees of freedom, and we can use the value of $t$ computed from the
data to test the null hypothesis. If $t$ is far enough from 0, we conclude that
$\mu_1 \neq \mu_2$.

This example differs most strikingly from the first in two ways: mathematical
content and the role of context. Example 1, which has essentially no mathematical 
content, finds its intellectual substance almost entirely in the interplay between 
pattern and story. Example 2, which has essentially no content apart from mathematics,
gets its intellectual substance without any explicit reference to applied
context. 

Although mathematicians often rely on applied context both for motivation and 
as a source of problems for research, the ultimate focus in mathematical thinking 
is on abstract patterns: the context is part of the irrelevant detail that must be 
boiled off over the flame of abstraction in order to reveal the previously hidden 
crystal of pure structure. In mathematics, context obscures structure. Like mathe- 
maticians, data analysts also look for patterns, but ultimately, in data analysis, 
whether the patterns have meaning, and whether they have any value, depends on 
how the threads of those patterns interweave with the complementary threads of 
the story line. In data analysis, context provides meaning. 

The difference has profound implications for teaching. To teach statistics well, 
it is not enough to understand the mathematical theory; it is not even enough to 
understand also the additional, non-mathematical theory of statistics. One must, 
like a teacher of literature, have a ready supply of real illustrations, and know how 
to use them to involve students in the development of their critical judgment. In 
mathematics, where applied context is so much less important, improvised exam- 
ples often work well, and teachers of mathematics become skillful at inventing 
examples on the spot. (Need a function to illustrate the chain rule? No problem: 
just make one up.) In statistics, however, improvised examples don’t work, because 
they don’t provide authentic interplay between pattern and context. Much as 
Bertrand Russell likened mathematics to sculpture for the austerity of its abstrac- 
tion, one might think of data analysis as like poetry, where pattern and context are 
inseparable. Imagine yourself teaching a lesson on basic prosody, introducing 
dactylic hexameter. It is not enough to say “TA ta ta, TA ta ta, TA ta ta,...;” your 
students need to hear dactyls in a real poem [20]: “This is the forest primeval. The 
murmuring pines and the hemlocks.” In a similar spirit, the teacher of statistics 
needs to know the data literature. If, for example, when you teach plots for data 
distributions, you use data on inter-eruption times for Old Faithful [30] and lengths 
of reigns of English kings and queens [13], your students can learn more than just 
the methods themselves. The bimodal shape of the inter-eruption times suggests 
two kinds of eruptions, and the distribution of monarchs’ reigns shows the 
skewness toward high values that is typical of waiting times. 

The contrasting roles of context in mathematics and statistics, especially as 
illustrated in the deliberately extreme first two examples, might seem to lend 
support to the false implication in Bullock’s [5] assertion that “Many statisticians
now claim that their subject is something quite apart from mathematics, so that
statistics courses do not require any preparation in mathematics.” In fact, while we
find the evidence that statistics is not mathematics persuasive (see [22], [24]), all
statistics courses require some preparation in mathematics, and some require a 
great deal. Elaborate mathematical theories undergird some parts of statistics, and 
the study of those theories is part of the training of statisticians. But although 
statistics cannot prosper without mathematics, the converse fails. That statistics is 
not a necessary part of a mathematician’s training is implicit in the statement by 
the eminent probabilist David Aldous [1] that he “is interested in the applications 
of probability to all scientific fields except statistics.”

What, then, is the role of mathematics in the science of statistics? An answer
should begin with a more systematic look at the logic of analyzing data. 


1.2. A schematic overview of statistical analysis. An old-style course that wanted 
to be conscientious about applications might finish off the second example with a 
little coda of an exercise. The data, although not this invented exercise, are from 
[25]; the full study is described in [21]. 


Example 2b. Calcium and blood pressure. Does increasing the amount of calcium in 
our diet reduce blood pressure? The following numbers give the decrease after 12 
weeks in systolic blood pressure for 21 human subjects. The 10 subjects in Group 1 
took a calcium supplement for 12 weeks; the 11 in Group 2 took a placebo. Test 
the hypothesis that the calcium had no effect on blood pressure. 


Group 1 (calcium): 7, -4, 18, 17, -3, -5, 1, 10, 11, -2
Group 2 (placebo): -1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3
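
The routine arithmetic that this exercise invites can be sketched in a few lines of Python. This is only an illustration, assuming the pooled-variance model of Example 2a; the helper name `pooled_t` is ours, not the source's, and only the standard library is used.

```python
import math
from statistics import mean, variance

calcium = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]        # Group 1
placebo = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]    # Group 2

def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance estimate (illustrative helper)."""
    n, m = len(x), len(y)
    # statistics.variance divides by (n - 1), matching s^2 in Example 2a.
    sp2 = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    t = (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / n + 1 / m))
    return t, n + m - 2

t, df = pooled_t(calcium, placebo)
print(f"t = {t:.3f} on {df} degrees of freedom")   # t = 1.634 on 19 degrees of freedom
```

The calculation says nothing about how the data were produced or whether the equal-variance assumption is reasonable; it only carries out the formula.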


This exercise, put so tersely, is a caricature, one that encourages the mistaken view 
that once the mathematical derivations from a model are completed, applications 
are largely a matter of routine arithmetic. For a more realistic perspective, 
consider Figure 2, a diagram of the stages in a statistical analysis. Before consider- 
ing this crude outline in detail, two cautions are essential. 


          (1)        (2)
   Design --> Data --> Patterns

   Model(s) --> Methods --> Results --> Interpretation
            (3)         (4)         (5)


Figure 2. A schematic representation of the phases of data production and analysis. 


1. The summary oversimplifies by suggesting a strict left-to-right progression. In 
reality, the process of data analysis is neither linear nor unidirectional.
Several transitions involve a dialog of sorts, sometimes between adjacent 
elements, but sometimes among more than just two. Thus, for example, the 
choice of design for data production determines the structure of the resulting 
data, but knowledge based on data already in hand can help shape the 
design, as when knowing the size of variation from one subject to another 
helps decide how many subjects will be needed. Similarly, the data may 
suggest a model, but the model leads to methods that send us back to the 
data to check for possible violations of the model’s assumptions. Perhaps 
most important of all, as we shall see, the final stage, interpretation of the 
results, depends in a crucial way on the first stage, the kind of design used 
for producing the data. 

2. The rough and qualified ordering of stages here is not meant to suggest that 
we think the topics taught in an introductory statistics course should follow 
the same order. For reasons presented later, we recommend beginning with 
methods for exploring and describing data, then going “back” to data 
production, and from there to formal inference. 


With these cautions assumed, the flowchart can provide a useful framework for 
examining the role of mathematics in statistics and summarizing elements of the
non-mathematical substance of the subject. Here are four quick observations:


1. Design, exploration, and interpretation are core elements of statistical think- 
ing. All three elements are heavily dependent on context, but at the introduc- 
tory level they involve very little mathematics. The (largely non-mathemati- 
cal) theory of experimental design is decades old and well developed; the 
theory of exploration is newer, and at present still primitive, although 
computer-based tools for exploration have become quite sophisticated; the 
theory of interpretation is fragmentary at best. 

2. The classical course in mathematical statistics corresponds so neatly to 
transition (3) that “from models to methods” might almost serve as a course 
title. Context is largely irrelevant here, because models are presented ab- 
stractly, as in Example 2a, and a typical derivation simply applies one 
optimality principle or another (least squares, maximum likelihood) to deduce
the method du jour.

3. Transition (4), from methods to results, is the focus of the old-style cookbook 
course, in which each method is summarized by a set of formulas. Context is 
irrelevant here also, in that you can learn computational algorithms, and in
fact learn them more efficiently, if you resist any temptation to encumber 
your brain with concern about what the methods are good for. All the same, 
some courses have tried to make the throat-clogging bolus of rote easier to 
get down by sugar-coating it with a thin glaze of ersatz context. Fortunately, 
the computer is fast sweeping courses like these into the dustbin of curricular 
history. 

4. It is perhaps ironic that transitions (3) and (4), the two that have most often 
been the focus of courses at the introductory level, are precisely the two that 
are intellectually most automatic (given our current limited understanding 
and less developed theory of the other transitions) and so offer the least 
room for judgment and creativity. 


To develop these points in more detail, we return to the example of calcium and 
blood pressure. In what follows, we combine the stages of Figure 2 under three 
broader headings: data production, data analysis, and inference. 


2. THE CONTENT OF STATISTICS 


2.1. Data production. The standard model of Example 2a is incomplete in a most 
serious way: it does not distinguish between observational data (e.g., from a sample 
survey) and data from a randomized comparative experiment. This distinction, 
between observation and experiment, is one of the most important in statistics. 
Researchers often want to reach causal conclusions: calcium causes a reduction in 
blood pressure. Experiments often allow causal conclusions, while observational 
studies almost always leave issues of causation unsettled and subject to debate. Yet 
the mathematical models of statistical theory are identical for observational and 
experimental data. 
The calcium study was in fact an experiment: 


Example 2c. The design of the calcium study [21]. Examination of a large sample of 
people revealed a relationship between calcium intake and blood pressure. The 
relationship was strongest for black men. Researchers therefore conducted an 
experiment.
The subjects in part of the experiment were 21 healthy black men. A randomly
chosen group of 10 of the men received a calcium supplement for 12 weeks. The 
control group of 11 men received a placebo pill that looked identical. The 
experiment was double-blind. 


Can we conclude that calcium has caused a reduction in blood pressure? Such 
an inference, that an observed difference may be taken at face value, stands on 
three legs. Two of the three are grounded in data production: 


(1) an argument—automatic only for random samples and randomized experi- 
ments—that a probability model applies to the data; 

(2) an argument—probability-based, and comparatively straightforward—that 
the observed difference is “real,” i.e., too big to be plausibly explained as
due just to chance variation; and 

(3) an argument—often thorny and fraught with pitfalls, except in the case of 
randomized experiments—that the observed difference is not due to some 
confounding influence distinct from the factor of interest. 


The t-test of Example 2a, like all statistical tests and confidence intervals, deals 
only with the second argument: “If we assume that a particular chance model 
applies, how likely is it to get an observed difference this big?” The other two 
arguments depend on the design. 

The clinical trial on the effect of calcium on blood pressure was a randomized
comparative experiment. Figure 3 presents the design in outline form. The great 
virtue of assigning the subjects at random is that it makes arguments (1) and (3) 
automatic, and so reduces the problem of inferring cause to checking the fit of a 
model, and then, given adequate fit, carrying out a straightforward calculation. The 
random assignment of subjects eliminates bias in forming the treatment groups and 
produces groups that differ only through chance variation before we apply the 
treatments. The comparative design reminds us that all subjects are treated exactly 
alike except for the contents of the pills they take. Thus if we observe differences 
in the mean reduction in blood pressure greater than could be expected to arise by 
chance, we can be confident that the calcium brought about the effect we see.


              --> Group 1 (10 patients) --> Treatment 1 (Calcium) --
   Random                                                           \
   Allocation                                                        --> Compare Blood Pressure
                                                                    /
              --> Group 2 (11 patients) --> Treatment 2 (Placebo) --

Figure 3. The simplest randomized comparative experiment. 
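
The random allocation step in this design is simple to carry out by machine. The following is a minimal sketch, assuming subjects are identified by the hypothetical labels 1 to 21; the function name `randomize` and the seed are ours and are chosen only so the example is reproducible.

```python
import random

def randomize(subjects, n_treatment, seed=None):
    """Split subjects at random into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)            # every ordering is equally likely
    return sorted(shuffled[:n_treatment]), sorted(shuffled[n_treatment:])

subject_ids = range(1, 22)           # 21 subjects with hypothetical labels 1..21
calcium_group, placebo_group = randomize(subject_ids, 10, seed=1692)
print("Calcium:", calcium_group)
print("Placebo:", placebo_group)
```

Because chance alone decides who receives the calcium, any systematic difference between the groups before treatment is ruled out by design rather than by argument.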


The other major means of producing data are sample surveys that choose and 
examine a sample in order to produce information about a larger population. 
Interesting examples abound—opinion polls sound and unsound, government 
collection of economic and social data, academic data sources such as the National 
Opinion Research Center at the University of Chicago. Statistical designs for 
sampling begin by insisting that impersonal chance should choose the sample. The 
central idea of statistical designs for producing data, through either sampling or 
experimentation, is the deliberate use of chance. Explicit use of chance mechanisms
eliminates some major sources of bias. It also ensures that quite simple
probability models describe our data production processes, and therefore that
standard inference methods apply. However, unlike randomized experiments, 
observational studies do not lend themselves in so straightforward a way to an 
inference of causation, as the following example shows. The original study by Best 
and Walker appears as an example in [12]; our presentation here follows [26]. 


Example 3. Smoking and health. One of the early observational studies of smoking 
and health compared mortality rates for three groups of men. The rates, in deaths 
per year per 1000 men, were: 


Non-smokers 20.2, Cigarette smokers 20.5, Cigar and pipe smokers 35.5. 


To test whether the observed differences might be due to chance, we could use a 
model similar to the one in Example 2a. The sample sizes were so large that we 
can easily rule out chance variation as an explanation for the observed differences, 
leaving us with the apparent conclusion that cigarettes pose little risk but pipes or 
cigars or both are quite dangerous. Indeed, that conclusion would be valid if these 
data had come from a randomized, controlled double-blind experiment like the 
calcium study. However, the premise is clearly untenable. Because this is an 
observational study, we need to ask about other factors, linked to smoking habits, 
that might be responsible for the observed difference. Here, age is the main such 
factor: pipe and cigar smokers tend to be older than cigarette smokers, and the 
risk of death increases with age. In this study, the average ages for the three 
groups were: 


Non-smokers 54.9 years, Cigarette smokers 50.5 years, 
Cigar and pipe smokers 65.9 years. 


Only after adjusting the death rates for the differences in age do we get numbers 
more in line with what we have come to expect: 


Non-smokers 20.3, Cigarette smokers 28.3, Cigar and pipe smokers 21.2. 


Taken together, the last two examples offer what we consider two of the most 
important lessons for mathematicians who teach statistics: one, the conclusions 
from a study depend crucially on how the data were produced, and two, the 
standard mathematical models ignore data production. 
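
How adjusting for age can reverse a comparison of crude rates is easy to see in a small calculation. The sketch below uses direct standardization, which is our assumption (the source does not say which adjustment method was used), and the numbers are invented for illustration. They are not the data of Example 3.

```python
# HYPOTHETICAL numbers, invented for illustration only.
# Each entry: age stratum -> (number of men, deaths per 1000 men per year).
groups = {
    "A": {"younger": (900, 12.0), "older": (100, 42.0)},
    "B": {"younger": (200, 10.0), "older": (800, 40.0)},
}

def crude_rate(strata):
    """Overall death rate, ignoring age."""
    total = sum(n for n, _ in strata.values())
    return sum(n * rate for n, rate in strata.values()) / total

def standardized_rate(strata, standard):
    """Death rate the group would show if it had the standard age mix."""
    total = sum(standard.values())
    return sum(standard[age] * rate for age, (_, rate) in strata.items()) / total

# Standard population: the two groups combined, stratum by stratum.
standard = {age: sum(g[age][0] for g in groups.values())
            for age in ("younger", "older")}

for name, strata in groups.items():
    print(name, round(crude_rate(strata), 1),
          round(standardized_rate(strata, standard), 1))
# A 15.0 25.5
# B 34.0 23.5
```

In this made-up case group B looks far worse on crude rates only because it is older; once both groups are given the same age mix, group A has the higher rate, as it does within each age stratum.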

Statistical ideas for producing data to answer specific questions are the most 
influential contributions of statistics to human knowledge. Badly designed data 
production is the most common serious flaw in statistical studies. Well designed 
data production allows us to apply standard methods of analysis and reach clear 
conclusions. Professional statisticians are paid for their expertise in designing 
studies; if the study is well designed (and no unanticipated disaster occurred), you 
don’t need a professional to do the analysis. In other words, the design of data 
production is really important. If you just say “Suppose $X_1$ to $X_n$ are iid
observations,” you aren’t teaching statistics. 


2.2. Data analysis: exploration and description. Data analysis is the contemporary 
form of “descriptive statistics,” powered by more numerous and more elaborate
descriptive tools, but especially by a philosophy due in large measure to John 
Tukey of Bell Labs and Princeton. The philosophy is captured in the now-common 
name, exploratory data analysis, or EDA. The goal of EDA is to see what the data 
in hand say, on the analogy of an explorer entering unknown lands. We put aside 
(but not forever) the issue of whether these data represent any larger universe.
Table 1 presents an elementary summary [25] of the distinctions between EDA and
standard inference:


TABLE 1. EXPLORATORY DATA ANALYSIS VS. FORMAL PROBABILITY-BASED INFERENCE


Exploratory Data Analysis               Statistical Inference


Purpose is unrestricted exploration     Purpose is to answer specific
of the data, searching for              questions, posed before the
interesting patterns.                   data were produced.


Conclusions apply only to the           Conclusions apply to a larger group
individuals and circumstances for       of individuals or a broader class
which we have data in hand.             of circumstances.


... what we see in the data.            ... a statement of our confidence in them.

In practice, exploratory analysis is a prerequisite to formal inference. Most 
real data contain surprises, some of which can invalidate or force modification of 
the inference that was planned. This is one reason why running data through a 
sophisticated (and therefore automated) inference procedure before exploring 
them carefully is the mark of a statistical novice. The dialog between data and 
models continues with more advanced diagnostic tools that allow data to criticize 
specific models. These tools combine the EDA spirit with the results of mathemat- 
ical analysis of the consequences of the models. 

As we have already seen, the model of Example 2a, because it does not 
distinguish between observation and experiment, is incomplete. It is also, like most 
idealized mathematical models for real phenomena, unrealistic. In the words 
attributed to the statistician George Box, “All models are wrong, but some are 
useful.” The user of inference methods based on this model must carefully explore 
its adequacy to the setting and the data. Were there flaws in the data production 
(whether sample or experiment) that render inference meaningless? Are the data, 
which are certainly not independent observations on a perfectly normal distribu- 
tion, sufficiently normal to allow use of standard procedures? This question is 
answered by exploratory examination of the data themselves, combined with
knowledge of how “robust” the planned analysis is under deviations from the
assumptions of the model.


Example 2d. Preliminary exploration of the calcium data. An analysis might start 
from a simple outline: plot, shape, center, spread. 


Plot. A stemplot splits each data value into a stem and leaf, then sorts leaves onto 
shared stems. Figure 4 shows a back-to-back stemplot useful for comparing two 
groups: 


Placebo        Calcium
      1 | -1 |
      5 | -0 | 5
  33111 | -0 | 234
     32 |  0 | 1
      5 |  0 | 7
      2 |  1 | 01
        |  1 | 78

Figure 4. Parallel stemplot of reduction in systolic blood pressure for two groups of men.

Shape. The distribution for the placebo group is unimodal and symmetric. The
treatment group, however, contains a faint suggestion of bimodality, which raises 
the possibility of two kinds of subjects. Might there be some who respond to 
calcium, and others who do not? There is no way to tell from these data, but the 
possibility is worth noting. 
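
A back-to-back stemplot of the kind described above can be generated directly from the data of Example 2b. The sketch below assumes split stems (leaves 0 to 4 and 5 to 9 on separate rows) and keeps a separate "-0" stem for negative single-digit values; these layout choices and all helper names are ours.

```python
from collections import defaultdict

calcium = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]
placebo = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]

def stem_position(value):
    """Row index of a value; each stem is split into leaves 0-4 and 5-9."""
    tens, leaf = divmod(abs(value), 10)
    position = 2 * tens + (0 if leaf < 5 else 1)
    # Negative values get their own rows, so the stems -0 and 0 stay distinct.
    return -position - 1 if value < 0 else position

def stem_label(position):
    """Printed stem for a row index, e.g. '-1', '-0', '0', '1'."""
    if position < 0:
        return f"-{(-position - 1) // 2}"
    return str(position // 2)

def leaves(values):
    rows = defaultdict(list)
    for v in values:
        rows[stem_position(v)].append(abs(v) % 10)
    return rows

def back_to_back(left, right):
    """Text rows with left-group leaves, stem, right-group leaves."""
    left_rows, right_rows = leaves(left), leaves(right)
    positions = list(left_rows) + list(right_rows)
    lines = []
    for pos in range(min(positions), max(positions) + 1):
        left_text = "".join(str(d) for d in sorted(left_rows.get(pos, []), reverse=True))
        right_text = "".join(str(d) for d in sorted(right_rows.get(pos, [])))
        lines.append(f"{left_text:>6} | {stem_label(pos):>2} | {right_text}")
    return lines

for line in back_to_back(placebo, calcium):
    print(line)
```

The right-hand column makes the two clusters in the calcium group easy to see, which is the faint suggestion of bimodality noted in the text.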


Center and spread. A useful plot for comparing centers, spreads and symmetries is 
the boxplot (Figure 5). Each box locates the quartiles and median of a distribution; 
the “whiskers” extend from the quartile to the most extreme points within 1.5 
interquartile ranges of the nearest quartile, and points at a greater distance from 
the median are shown separately. Here we find a difference in medians, but also a 
pronounced difference in spreads, one that should raise suspicions about the 
assumption of equal variances used to justify a pooled estimate in Example 2a. 


[Boxplots. Vertical axis: decrease in systolic blood pressure. Groups: Placebo, Calcium.]


Figure 5. Parallel boxplots of reduction in systolic blood pressure for two groups of men. 
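
The numbers behind a boxplot are a five-number summary and the 1.5 interquartile range rule. The sketch below assumes one common convention, taking each quartile as the median of the lower or upper half of the sorted data; other conventions give slightly different quartiles, so points near a fence may be classified differently by other software. The helper names are ours.

```python
from statistics import median

calcium = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]
placebo = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]

def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles as medians of the two halves."""
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[-half:]   # middle value excluded when n is odd
    return data[0], median(lower), median(data), median(upper), data[-1]

def outside_points(values):
    """Points more than 1.5 interquartile ranges beyond the nearest quartile."""
    _, q1, _, q3, _ = five_number_summary(values)
    step = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - step or v > q3 + step]

for name, group in (("Calcium", calcium), ("Placebo", placebo)):
    low, q1, med, q3, high = five_number_summary(group)
    print(name, (low, q1, med, q3, high), "IQR =", q3 - q1)
```

Under this convention the interquartile range is 14 for the calcium group and 6 for the placebo group, which is the pronounced difference in spreads that the boxplots display.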


Normal quantile plot. Looking ahead to a t-test to compare means, it is prudent to 
ask whether the data give us reason to question the normal model of Example 2a. 
Here we subtract the group mean from each observation to get residuals, then plot 
the ordered residuals against the corresponding quantiles of a normal distribution; 
see Figure 6. Our ordinates are the 21 ordered residuals, which divide the real line 
into 22 sub-intervals. The corresponding abscissas are the 21 values that divide the 
real line into 22 segments that are equiprobable under the normal model. If the 
data come from a single normal distribution, we can expect the points to fall near a 
line.

Figure 6. Normal quantile plot for the blood pressure data.


For the calcium data, the pattern is reasonably linear, although the vertical 
jump before the three right-most points shows observed residuals that are larger 
than predicted by the normal model, a pattern consistent with the unequal spreads 
in the boxplots. 

Mathematically structured instruction, which tends to emphasize how methods 
follow from models, often provides only the most general warnings about the 
realities of practice. Statistics in practice resembles a dialog between models and 
data. Models for the process that produced our data do indeed play a central role 
in statistical inference. The mathematical exploration of properties and conse- 
quences of models is therefore important (as it is in economics and physics). But 
the data are also allowed to criticize and even falsify proposed models. In the 
calcium examples, the exploratory analysis warns us not to rely heavily on the 
assumption of equal variances, and to use a modified t-test that estimates separate 
variances for the two groups. We can modify Box’s dictum into a practical version 
of the statement that statistics is not just mathematics: Mathematical theorems are 
true; Statistical methods are sometimes effective when used with skill. 
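
A t-test with separate variance estimates can be sketched as below. The version shown is the one commonly called Welch's test, with Satterthwaite's approximation for the degrees of freedom; the source does not name the specific modification, so that choice is an assumption, as is the helper name.

```python
import math
from statistics import mean, variance

calcium = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]
placebo = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]

def separate_variance_t(x, y):
    """t statistic and approximate degrees of freedom without pooling variances."""
    n, m = len(x), len(y)
    vx, vy = variance(x) / n, variance(y) / m    # squared standard errors of the means
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Satterthwaite's approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
    return t, df

t, df = separate_variance_t(calcium, placebo)
print(f"t = {t:.3f}, df = {df:.2f}")   # t = 1.604, df = 15.59
```

Compared with the pooled calculation (t = 1.634 on 19 degrees of freedom), the statistic is slightly smaller and is referred to a t-distribution with fewer degrees of freedom.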

Wide availability of cheap computing, especially graphics, has combined with 
the desire to “let the data speak” to generate an abundance of new tools: at the 
low end we have the stemplots and boxplots of Example 2d; but there are also
model-free scatterplot smoothers, resistant regression algorithms, clever ideas for 
display of high-dimensional data on two-dimensional screens, and many still more 
advanced diagnostic tools for specific situations. Standard statistical software 
implements much of this. The books [7] and [9], by Bell Labs scientists influenced 
by Tukey, present much of the basic graphical material. The software packages S 
and S-PLUS, which originated at Bell Labs, implement more of the new graphics 
and also implement several new classes of models. See [8] for detailed discussion of 
the latter.

Although it may be tempting for the neophyte to view data analysis as merely a
collection of clever tools, the value of these tools comes from using them in a 
systematic way, according to strategies that organize the examining of data: 


1. Proceed from simple to complex: first examine each variable individually, 
then look at relationships among them. 

2. Use a hierarchy of tools: first plot the data, then choose appropriate 
numerical descriptions of specific aspects of the data, then if warranted 
select a compact mathematical model for the overall pattern of the data. 

3. Look at both the overall pattern and at any striking deviations from that 
pattern. 


It is part of the unifying (but non-mathematical) theory of EDA that these 
principles apply in each of several settings. Given data on a single quantitative 
variable, we might display the distribution by a stemplot, note that it is reasonably
symmetric, calculate the mean and standard deviation as numerical summaries,
and use a normal quantile plot to see whether a normal distribution is a suitable 
compact model for the overall pattern. Given two quantitative variables, we draw a 
scatterplot, measure the direction and strength of linear association by the correla- 
tion, and, if warranted, use a fitted straight line as a model for the overall pattern. 
Thus the univariate “Plot, shape, center, spread,” returns in the context of bivariate
data as “Plot, shape, direction, strength.” 

Here, as elsewhere, an analysis is not just a search for patterns, but a search for 
meaningful patterns. The best fit is not necessarily the most useful, as the following 
example illustrates. 


Example 3. Dormitories and cities. Each point in Figure 7 represents one of the 50
U.S. states with horizontal coordinate equal to the state’s urban population, and
vertical coordinate equal to the number of the state’s college students housed in
dormitories. Several features of the plot’s shape stand out. For example, the plot is
fan shaped, with many points bunched in the lower left: most states have relatively
small urban populations (a couple of million or so) and relatively small dormitory
populations as well (under 50,000); only a few states have very large urban
populations or very large dormitory populations, and the variability from state to
state is larger (more space between points) for the states with larger values. The
pattern of association between the two variables is positive and strong: smaller
urban populations go with smaller dormitory populations, larger urban populations
with larger dormitory populations and, for all but a few of the states, knowing the
size of a state’s urban population allows us to predict its dormitory population to
within a fairly narrow range.

Figure 7. Scatterplot of dormitory population versus urban population for the 50 U.S. states.

Despite the nice fit between picture and story, the analysis so far has over- 
looked a most important feature. If we take at face value the pattern that states 
with large urban populations also have large dormitory populations, we might be 
tempted to conclude that cities must attract colleges. Although plenty of 
confirming instances come to mind, this naive interpretation is wrong: both our variables 
are indirect measures of the size of the states’ populations, so it is hardly surprising 
that the two measures show a strong positive association. To uncover a more 
meaningful relationship, we have to “adjust for the lurking variable”: divide urban 
population by total population to get percent urban, divide dormitory popula- 
tion by total population to get percent living in dormitories, and plot the result 
(Figure 8). 

Figure 8. Scatterplot of the dorms-and-cities data after adjusting for the “lurking variable” population. 


Now the relationship is weaker, but what it tells us is more interesting. The 
direction is reversed: rural states (those with a lower percentage of their residents 
living in metropolitan areas) have a higher percentage of their residents living 
in college dormitories. On reflection, this makes sense. Think about Pullman, 
Washington, or Ames, Iowa; about Norman, Oklahoma, or Lawrence, Kansas. 
Rural states may have fewer colleges and universities in absolute numbers, but 
their students make up a higher percentage of the total population of the state, 
and are more likely to live in dormitories. 
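
The reversal in Example 3 can be imitated with made-up numbers. The sketch below does not use the state data behind Figures 7 and 8; it generates 50 hypothetical "states" under two stated assumptions (populations spread widely, and a dormitory share that falls as the urban share rises) and compares the correlation of the raw counts with the correlation of the population-adjusted shares. The function names and all parameter values are ours, chosen only for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_states(n_states=50, seed=1):
    """Generate hypothetical (population, urban count, dorm count) triples."""
    rng = random.Random(seed)
    states = []
    for _ in range(n_states):
        # ASSUMPTION: populations range from 0.5 to 30 million on a log scale.
        population = math.exp(rng.uniform(math.log(0.5e6), math.log(30e6)))
        urban_share = rng.uniform(0.30, 0.95)
        # ASSUMPTION: dormitory share falls as urban share rises, plus noise.
        dorm_share = max(0.0005, 0.012 - 0.008 * urban_share + rng.gauss(0, 0.0015))
        states.append((population, population * urban_share, population * dorm_share))
    return states


states = simulate_states()
urban_counts = [urban for _, urban, _ in states]
dorm_counts = [dorm for _, _, dorm in states]
urban_shares = [urban / pop for pop, urban, _ in states]
dorm_shares = [dorm / pop for pop, _, dorm in states]

r_counts = pearson_r(urban_counts, dorm_counts)  # raw counts, in the spirit of Figure 7
r_shares = pearson_r(urban_shares, dorm_shares)  # adjusted shares, in the spirit of Figure 8
print(f"correlation of counts: {r_counts:.2f}")
print(f"correlation of shares: {r_shares:.2f}")
```

Because both counts are products of the same population figure, their correlation comes out positive even though the shares were built to move in opposite directions.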


2.3. Formal inference: the argument against chance. Statistical inference provides 
methods for drawing conclusions from data about the population or process from 
which the data were drawn. It now becomes essential (as it was not in data 
analysis) to distinguish sample statistics from population parameters. The true 
values of the parameters are unknown to us. We have the statistics in hand, but 
they would take different values if we repeated our data production. Inference 
must take this sample variability into account. 

Probability describes one kind of variability, the chance variability in random 
phenomena. When a chance mechanism is explicitly used to produce data, proba- 
bility therefore describes the variation we expect to see in repeated samples from 
the same population or repeated experiments in the same setting. That is, 
probability answers the question, “What would happen if we did this many times?” 
Standard statistical inference is based on probability. It offers conclusions from 
data along with an indication of how confident we are in the conclusions. The 
statement of confidence is based on asking “What would happen if I used this 
inference method many times?” That is exactly the kind of question probability can 
answer, which is why we ask it. The indication of our confidence in our methods, 
expressed in the language of probability, is what distinguishes formal inference 
from informal conclusions based on, e.g., an exploratory analysis of data. 

Any particular inference procedure starts with a statistic, perhaps several 
statistics, calculated from the sample data. The sampling distribution is the proba- 
bility distribution that describes how this statistic would vary if we drew many 
samples from the same population. In elementary statistics we present two types of 
inference procedures, confidence intervals and significance tests. A confidence 
interval estimates an unknown parameter. A significance test assesses the evidence 
that some sought-after effect is present in the population. 

A confidence interval consists of a recipe for estimating an unknown parameter 
from sample data, usually of the form “estimate ± margin of error,” and a 
confidence level, which is the probability that the recipe actually produces an interval 
that contains the true value of the parameter. That is, the confidence level answers 
the question, “If I used this method many times, how often would it give a correct 
answer?” 
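
The question "If I used this method many times, how often would it give a correct answer?" can be put to a computer directly. The following is a minimal sketch, assuming a normal population with a known standard deviation and the familiar recipe "sample mean ± 1.96 σ/√n"; the population values, sample size, and function names are hypothetical choices of ours.

```python
import math
import random


def z_interval(sample, sigma, z_star=1.96):
    """Return (low, high) for the recipe: sample mean ± z* · sigma / sqrt(n)."""
    n = len(sample)
    estimate = sum(sample) / n
    margin = z_star * sigma / math.sqrt(n)
    return estimate - margin, estimate + margin


def coverage(true_mean, sigma, n, repetitions=2000, seed=7):
    """Fraction of repeated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        if low <= true_mean <= high:
            hits += 1
    return hits / repetitions


# Hypothetical population: mean 10, standard deviation 3, samples of size 25.
observed_coverage = coverage(true_mean=10.0, sigma=3.0, n=25)
print(f"intervals containing the true mean: {observed_coverage:.1%}")
```

The printed fraction should land close to the nominal 95%, which is all the confidence level claims: a statement about the recipe over many uses, not about any one interval.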

A significance test starts by supposing that the sought-after effect is not present 
in the population. It asks “In that case, is the sample result surprising or not?” A 
probability (the p-value) says how surprising the sample result is. A result that 
would rarely occur if the effect we seek were absent is good evidence that the 
effect is in fact present. Figure 9 illustrates this reasoning in our medical example. 
The normal curves in that figure represent the sampling distribution of the 
difference $\bar{x} - \bar{y}$ between the mean blood pressure decreases in the calcium and 
placebo groups, for the case of no difference between the two population means. 
This distribution, which shows the variability due to chance alone, has mean 0. 
Outcomes greater than 0 come from experiments in which calcium reduces blood 
pressure more than the placebo. If we observe result A, we are not surprised; an 
outcome this far above 0 would often occur by chance. It provides no credible 
evidence that calcium beats the placebo. If we observe result B, on the other hand, 
the experiment has produced an effect so strong that it would almost never occur 
simply by chance. We then have strong evidence that the calcium mean does 
exceed the placebo mean. The p-value (the right tail probability) is 0.24 for point 
A and 0.0005 for point B. These probabilities quantify just how surprising an 
observation this large is when there is no effect in the population. What about the 
actual data? Point C shows the observed value $\bar{x} - \bar{y} = 5.273$. The corresponding 
p-value is 0.055. Calcium would beat the placebo by at least this much in 5.5% of 
many experiments just by chance variation. The experiment gives some evidence 
that calcium is effective, but not extremely strong evidence. A note for those who 
worry about details: These p-value calculations took the variability of the sample 
means to be known. In practice, we must estimate standard deviations from the 
data. The resulting test has a larger p-value: p = 0.072. 


Figure 9. The idea of statistical significance: is this observation surprising? 
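
A right-tail p-value of the kind quoted for points A, B, and C is just an area under a normal curve centered at 0. The sketch below shows the calculation for point C. The standard error of the difference is not stated in this section, so the value 3.3 used here is purely an illustrative assumption of ours, picked so that the result lands near the reported 0.055; it should not be read as the figure from the study.

```python
import math


def right_tail_p(observed, standard_error, null_value=0.0):
    """P(statistic >= observed) when the statistic is normal with mean
    null_value and standard deviation standard_error."""
    z = (observed - null_value) / standard_error
    return 0.5 * math.erfc(z / math.sqrt(2))


# ASSUMPTION: illustrative standard error, not taken from the article.
ASSUMED_SE = 3.3

p_c = right_tail_p(5.273, ASSUMED_SE)
print(f"right-tail p-value for the observed difference: {p_c:.3f}")
```

An observation exactly at the null value gives 0.5, and the p-value shrinks as the observation moves further into the right tail, which is the "how surprising?" scale the figure depicts.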


3. TEACHING. In discussing our teaching, we may focus on content, what we 
want our students to learn, or on pedagogy, what we do to help them learn. These 
two topics are of course related. In particular, changes in pedagogy are often 
driven in part by changing priorities for what kinds of things we want students to 
learn. It is nonetheless convenient to address content and pedagogy separately. 
This section, in keeping with the rest of this article, concerns content, and in 
particular contains one side of a conversation between statisticians and mathemati- 
cians who may find themselves teaching statistics. 


3.1. Statistics should be taught as statistics. Statisticians are convinced that 
statistics, while a mathematical science, is not a subfield of mathematics. Like 
economics and physics, statistics makes heavy and essential use of mathematics, yet 
has its own territory to explore and its own core concepts to guide the exploration. 
Given those convictions, we would naturally prefer that beginning statistics be 
taught as statistics. The American Statistical Association and the MAA have 
formed a joint committee to discuss the curriculum in elementary statistics. The 
recommendations of that group reflect the view that statistics instruction should 
focus on statistical ideas. Here are some excerpts [10]; a longer discussion appears 
in [11]: 


Almost any course in statistics can be improved by more emphasis on data 
and concepts, at the expense of less theory and fewer recipes. To the 
maximum extent feasible, calculations and graphics should be automated. 

Any introductory course should take as its main goal helping students to 
learn the basics of statistical thinking. [These include] the need for data, the 
importance of data production, the omnipresence of variability, the quantifi- 
cation and explanation of variability. 


The recommendations of the ASA/MAA committee reflect changes in the field of 
statistics over the past generation. Academic statistics, unlike mathematics, is 
linked to a larger body of non-academic professional practice. Computing technol- 
ogy has completely changed the practice of statistics. Academic researchers, driven 
in part by the demands of practice and in part by the capability of new technology, 
have changed their taste in research. Bootstrap methods, nonparametric data 
smoothing, regression diagnostics, and more general classes of models that require 
iterative fitting are among the recent fruits of renewed attention to analysis of data 
and scientific inference. Efron and Tibshirani [14] describe some of this work for 
non-specialists. 


3.2. Neither Mathematics Nor Magic. An over-emphasis on probability-based 
inference is one mark of an overly mathematical introduction to statistics, and yet 
the reluctance of mathematically trained teachers to abandon a theory-driven 
presentation of basic statistics has a respectable basis: to avoid presenting statistics 
as magic. It is certainly common to teach beginning statistics as magic. The user of 
statistics is in many ways very like the sorcerer’s apprentice. The incantation has an 
automatic effectiveness, rendering theses acceptable and studies publishable. We 
are not meant to understand how the incantation works—that is the domain of the 
sorcerer himself. The incantation must follow the recipe exactly, lest disaster ensue 
—exploration and flexibility, like understanding, are forbidden to the apprentice. 
Fortunately, the sorcerer has provided software that automates the exact following 
of approved incantations. 

The danger of statistics-as-magic is real. But the proper defense is not a retreat 
to a mathematical presentation that is inadequate to the subject and often 
incomprehensible to students. Mathematical understanding is not the only kind of 
understanding. It is not even the most helpful kind in most disciplines that employ 
mathematics, where understanding of the target phenomena and core concepts of 
the discipline take precedence. We should attempt to present an intellectual 
framework that makes sense of the collection of tools that statisticians use and 
encourages their flexible application to solve problems. Students understand 
mathematics when they appreciate the power of abstraction, deduction, and 
symbolic expression, and can use mathematical tools and strategies flexibly in 
dealing with varied problems. Reasoning from uncertain empirical data is a 
similarly powerful and pervasive intellectual method. How can we best lead our 
students to understand, appreciate, and begin to assimilate this intellectual method? 


3.3. Begin with exploratory data analysis. Although the implied chronology of 
Figure 2 suggests starting with data production, experience says otherwise. For one 
thing, exploratory data analysis makes a better beginning because it is more 
concrete. There is no need to distinguish population and sample, and no need to 
discuss the features of randomization that protect against bias. Basic methods are 
conceptually and algorithmically simple, and the data are in hand—actual num- 
bers on a page, as opposed to mere ghosts of data-in-the-future, the way they are 
in designing an experiment. Moreover, providing motivation is not a problem. 
Students like exploratory analysis and find that they can do it, a substantial bonus 
when teaching a subject feared by many. Engaging them early on in the 
interpretation of results, before the harder ideas come along to claim their attention, can 
help establish good habits that pay dividends when you get to inference. Finally, 
starting with data analysis prepares for design and for inference. Experience with 
data distributions introduces students to the omnipresence of variability, and to 
the potential for bias, the two main reasons we need careful design. If you teach 
design before data analysis, it is harder for students to understand why design 
matters. Experience with data distributions is also the best way to get ready to 
tackle the difficult idea of a sampling distribution. 

We have tried to suggest that there is a coherent (though not mathematical) set 
of ideas and associated tools for exploring data. Students need to practice these 
ideas and tools by writing coherent descriptions of data. To help them, we provide 
both outlines for what to write, and examples that can serve as models. Figure 10, 
for example, is the outline for describing a single quantitative variable. 

A. Describe the data 
     number of observations 
     nature of the variable 
          how it was measured 
          units of measurement 

B. Plot the data, choose from 
     dotplot 
     stemplot 
     histogram 

C. Describe the overall pattern 
     shape 
          no clear shape? 
          skew or symmetric? 
          single or multiple peaks? 
     center and spread; choose from 
          five-number summary 
          mean and standard deviation 
     is normality an adequate model (normal quantile plot)? 

D. Look for striking deviations from the overall pattern 
     outliers 
     gaps or clusters 

E. Interpret your findings in C and D in the language of the problem setting. Suggest plausible 
   explanations for your findings. 


Figure 10. Outline for describing data on a single quantitative variable. 


Following this outline requires both knowledge of the tools it mentions and 
judgment to choose among them and interpret the results. Judgment is formed by 
experience with data. Students cannot at first “read” graphs any more than they 
can read words or equations. Here is an example of a basic one-variable data 
analysis. Describing relations among several variables requires more elaborate 
tools and finer judgment. 


In a study of resistance to infection [2], researchers injected 72 guinea pigs 
with tubercle bacilli and measured their survival time in days after infection. 
Both a histogram (Figure 11) and a normal quantile plot (Figure 12) show 
that the distribution of survival times is strongly skewed to the right. There 
are no outliers; although some individuals survived far longer than the 
average, this appears to be a characteristic of the overall distribution rather 
than pointing to, for example, errors in measuring or recording these individuals. 

Figure 11. Histogram of guinea pig survival times. 

Figure 12. Normal quantile plot for guinea pig survival times (axis label: standard normal quantiles). 


The strong skewness suggests that the five-number summary (min = 43 days, 
first quartile = 82.5 days, median = 102.5 days, third quartile = 151.5 days, 
max = 598 days) is a better numerical summary than the mean and standard 
deviation ($\bar{x} = 141.8$ days, s = 109.2 days). There is very large variation in 
survival times among the individuals—for example, the third quartile is 
almost 150% of the median and the largest 6 observations are more than 
double the median. Without more information, we cannot accurately predict 
the survival time of an infected individual. Moreover, standard t procedures 
should not be used for inference about survival time. Inference could employ 
a non-normal distribution as a model or seek a transformation to a scale that 
is more nearly normal. 
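
The numerical part of such a description is easy to automate. The sketch below computes a five-number summary alongside the mean and standard deviation. The 72 survival times are not reproduced in this article, so the data here are a short made-up list with a long right tail, and the quartile rule (median of each half, leaving out the overall median when the count is odd) is one common convention among several; software packages may give slightly different quartiles.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Quartiles are the medians of the lower and upper halves of the ordered
    data, excluding the overall median when the number of values is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], statistics.median(lower), statistics.median(ordered),
            statistics.median(upper), ordered[-1])


# HYPOTHETICAL right-skewed survival times in days (not the guinea pig data).
times = [40, 60, 75, 82, 90, 100, 105, 130, 150, 200, 600]

summary = five_number_summary(times)
mean = statistics.mean(times)
sd = statistics.stdev(times)
print("five-number summary:", summary)
print(f"mean = {mean:.1f} days, s = {sd:.1f} days")
```

With a single very long survival time in the list, the mean is pulled well above the median, which is the pattern that makes the five-number summary the safer description for skewed data.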

Although many students come to a first statistics course expecting empty 
ritual, EDA offers them the pleasant surprise that the methods exist to serve 
the search for meaning. This surprise is so welcome that it carries a danger of 
pushing the pendulum too far the other way. Some students may drift into a 
complacent conviction that any story about the data that fits the patterns 
with coherence and plausibility must be true. The timing is right for a dose of 
design and skepticism. 


3.4. Teach design as the bridge between data analysis and inference. An introduc- 
tion to design for data production fits naturally between exploratory analysis and 
inference: sound design is what makes inference possible. Waiting to introduce 
probability distributions until after the basics of design has a number of advan- 
tages. For one thing, this order helps make clear that the justification for 
probability models must come from the randomness in the data production 
process, and so provides some protection against unthinking adoption of probabil- 
ity models. For another, learning about data production introduces students to 
essential concepts like population and sample, parameter and statistic, before they 
encounter the sampling distribution, which is conceptually difficult all by itself. 

The single most important point for students to understand is why randomized 
comparative experiments are the gold standard for evidence of causation. A rich 
source of true-life cautionary tales is the book [6], edited by the physicians Bunker 
and Barnes and the statistician Mosteller, which contains striking examples of 
medical treatments that became standard in the days before medicine adopted 
randomized comparative experiments, and were found to be worthless when 
subjected to proper testing. 

There is of course more to the statistical side of designing experiments and 
sample surveys than “randomize.” The designs used in practice are often quite 
complex, and must balance efficiency with the need for information of varying 
precision about many factors and their interactions. Simple designs—randomized 
experiments comparing two or several treatments, simple random samples from 
one or several populations—illustrate the most important ideas and support the 
inference taught in a first statistics course. You must talk about these designs, but 
need not go farther. Some other important material, for example, procedures for 
developing and testing survey questions and for training and supervising 
interviewers, is not usually presented in statistics courses. Statistics students should be 
aware that these practical skills do matter, and that data production can go awry 
even when we start with a sound statistical design. How much time to spend here is 
a matter of your judgment of the needs of your audience. 


3.5. Inference: two barriers to understanding. Section 2.3 has described briefly 
how inference works. Because the details are in practice automated, we would like 
students to put most of their effort into grasping the ideas. They are not easy to 
grasp. The first barrier is the notion of a sampling distribution. Choose a simple 
setting, such as using the proportion $\hat{p}$ of a sample of workers who are 
unemployed to estimate the proportion p of unemployed workers in an entire 
population. Physical examples (sampling beads from a box), computer simulations, and 
encouraging thought experiments all help convey the idea of many samples with 
many values of $\hat{p}$. Keep asking, “What would happen if I did this many times?” 
That question is the key to the logic of standard statistical inference. 

Once the idea of a sampling distribution begins to settle, the tools of data 
analysis help us take the next steps. Faced with any distribution, we ask about 
shape, center, and spread. The shape of the sampling distribution of $\hat{p}$ is 
approximately normal. The mean is equal to the unknown population proportion p. This 
says that $\hat{p}$ as an estimator of p has no bias, or systematic error. The precision of 
the estimator is described by the spread of the sampling distribution, which (thanks 
to normality) we measure by its standard deviation. We are now only details away 
from confidence intervals. 
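
A computer simulation of the kind mentioned above might look like the following sketch. It draws many samples from a very large population with a known unemployment rate, records the sample proportion each time, and then asks the data-analysis questions about the resulting distribution: center, spread, and (roughly) shape. The rate of 6%, the sample size of 400, and the function names are hypothetical choices of ours.

```python
import math
import random
import statistics


def sample_proportion(p, n, rng):
    """Draw one sample of n workers and return the proportion unemployed."""
    return sum(rng.random() < p for _ in range(n)) / n


def simulate_p_hats(p, n, repetitions=2000, seed=3):
    """Sample proportions from many repeated samples of the same size."""
    rng = random.Random(seed)
    return [sample_proportion(p, n, rng) for _ in range(repetitions)]


P_TRUE = 0.06  # hypothetical population proportion
N = 400        # hypothetical sample size
p_hats = simulate_p_hats(P_TRUE, N)

center = statistics.mean(p_hats)
spread = statistics.stdev(p_hats)
theory_sd = math.sqrt(P_TRUE * (1 - P_TRUE) / N)
# For a roughly normal shape, about 95% of values fall within two standard deviations.
within_two_sd = sum(abs(v - P_TRUE) <= 2 * theory_sd for v in p_hats) / len(p_hats)

print(f"center of the sample proportions: {center:.4f} (population value {P_TRUE})")
print(f"spread of the sample proportions: {spread:.4f} (theory {theory_sd:.4f})")
print(f"fraction within two standard deviations: {within_two_sd:.3f}")
```

Students can rerun the simulation with a larger N and watch the spread shrink while the center stays put, which separates the ideas of bias and precision without any formal probability.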

The second major barrier is the reasoning of significance tests. Although the 
basic idea (“Is this outcome surprising?”) is not recondite, the details are daunting. 
There’s no escape from null and alternative hypotheses and one- versus two-sided 
tests. The logic of testing, which starts out “Suppose for the sake of argument that 
the effect we seek is not present...” isn’t straightforward. We’d like most of our 
students to understand the idea of a sampling distribution; we know that quite a 
few won’t understand the reasoning of significance tests. Our fallback position is to 
insist that they be able to verbalize the meaning of p-values produced by software 
or reported in a journal. This is part of insisting that students write succinct 
summaries of statistical findings. “The study compared two methods of teaching 
reading to third-grade students. A two-sample t test comparing the mean scores of 
the two treatment groups on a standard reading test had p-value p = 0.019. That 
is, the study observed an effect so large that it would occur just by chance only 
about 2% of the time. This is quite strong evidence that the new method does 
result in a higher mean score than the standard method.” 

Two concluding remarks about inference. First, a conceptual grasp of the ideas 
is almost pictorial, based on picturing the sampling distribution and following the 
tactics learned in data analysis. No amount of formal mathematics can replace this 
pictorial vision, and no amount of mathematical derivation will help most of our 
students see the vision. The mathematics is essential to our knowing the facts, but 
this does not imply that we should impose the mathematics on our students. 

Second, we want our students to know a good deal more than the big picture 
and several recipes that implement it in specific settings. Here are some further 
points, both practical and conceptual, roughly in order of importance. How far 
down the list you should go depends on your audience. 


• Study of specific inference procedures reveals behaviors that are common and 
that all students should understand. To get higher confidence from the same 
data, you must pay with a larger margin of error. Even effects so small as to 
be practically unimportant are highly significant in the statistical sense if we 
base a significance test on a very large sample. 

• Lots of things can go wrong that make inference of dubious value. Comparing 
subjects who choose to take calcium against others who don’t tells little about 
the effects of calcium, because those who choose to take calcium may be very 
health-conscious in general. One extreme outlier could pull the conclusion of 
our medical experiment in either direction, again invalidating the inference. 
Examine the data production. Plot the data. Then, perhaps, go on to infer- 
ence. 

• Inference procedures themselves don’t tell us that something went wrong. The 
margin of error in a confidence interval, for example, includes only the 
chance variation in random sampling. As the New York Times says in the box 
that accompanies its opinion poll results, “In addition to sampling error, the 
practical difficulties of conducting any survey of public opinion may introduce 
other sources of error into the poll.” 

• Common inference procedures really are based on mathematical models like 
the one that appears in our medical example: $X_1, X_2, \ldots, X_{n_1}$ iid $N(\mu_1, \sigma_1)$, 
$Y_1, Y_2, \ldots, Y_{n_2}$ iid $N(\mu_2, \sigma_2)$. This model isn’t exactly true; is it useful? In fact, 
the two-sample t procedures that follow from this model when we want to 
compare $\mu_1$ and $\mu_2$ are quite robust against non-normality, so the model does 
lead to practically useful procedures. But the variance ratio F statistic for 
comparing $\sigma_1$ and $\sigma_2$ is extremely sensitive to non-normality, so much so that 
it is of little practical value. Even beginners need to be aware of such issues. 

• We often want to do inference when our data do not come from a random 
sample or randomized comparative experiment. Think, for example, of 
measurements on successive parts flowing from an assembly line. Inference is 
justified by a probability model for the process that produced our data, and 
the correctness of the model can to some extent be assessed from the data 
themselves. Randomized data production is the paradigm and the most secure 
setting for inference, but it is not the only allowable setting. 

• Inductive inference from data is conceptually complex. It’s not surprising that 
there are alternative ways of thinking about it. Standard statistical theory 
tends to think of inference as if its purpose were to make decisions. A test 
must decide between the null and alternative hypotheses, for example. This 
leads at once to Type I and Type II errors and so on. The decision-making 
approach fits uneasily with the “Is this outcome surprising?” logic expressed 
by p-values. We think that assessing the strength of evidence is a much more 
common goal than making a decision, but not everyone agrees. The Bayesian 
school of thought goes farther, by introducing an explicit description of the 
available prior information into any statistical setting and combining prior 
information with data to reach a decision. Almost all statisticians think this is 
sometimes a good idea. Bayesians think all statistical problems can be made 
to fit this paradigm. This is a (strongly held) minority position. Deep water 
ahead. 
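
The two behaviors named in the first point of the list above (higher confidence costs a larger margin of error, and tiny effects become "significant" in huge samples) can be seen in a few lines of arithmetic. This is a minimal sketch for the simplest textbook setting, a z procedure for one mean with known standard deviation; the values of sigma, n, and the effect size are hypothetical choices of ours.

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """Margin of error z* · sigma / sqrt(n) for a z interval at the given level."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z_star * sigma / math.sqrt(n)


def two_sided_p(effect, sigma, n):
    """Two-sided z-test p-value when the sample mean misses the null value by `effect`."""
    z = abs(effect) / (sigma / math.sqrt(n))
    return math.erfc(z / math.sqrt(2))  # equals 2 * P(Z >= z), accurate far in the tail


SIGMA = 10.0  # hypothetical population standard deviation

# Same data (n = 100), three confidence levels.
margins = {level: margin_of_error(SIGMA, 100, level) for level in (0.90, 0.95, 0.99)}
for level, margin in margins.items():
    print(f"{level:.0%} confidence: margin of error {margin:.2f}")

# Same tiny effect (one hundredth of a standard deviation), two sample sizes.
p_small_n = two_sided_p(0.1, SIGMA, 100)
p_large_n = two_sided_p(0.1, SIGMA, 1_000_000)
print(f"n = 100:       p = {p_small_n:.2f}")
print(f"n = 1,000,000: p = {p_large_n:.1e}")
```

The effect is the same size in both tests; only the sample size changes, which is why a small p-value by itself says nothing about practical importance.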


3.6. What About Probability? Probability is an essential part of any mathematical 
education. It is an elegant and powerful field of mathematics that enriches the 
subject as a whole by its interactions with other fields of mathematics. Probability 
is also essential to serious study of applied mathematics and mathematical model- 
ing. The domain of determinism in natural and social phenomena is limited, so 
that the mathematical description of random behavior must play a large role in 
describing the world. Whether our mathematical tastes run to purity or modeling, 
probability helps to satisfy them. Here, however, we are discussing introductory 
statistics rather than mathematics. 

From the point of view of deductive logic that has shaped so much of statistical 
teaching in the past, probability is more basic than statistics: probability provides 
the chance models that describe the variability in observed data. From the point of 
view of the development of understanding, however, we believe that statistics is 
more basic than probability: whereas variability in data can be perceived directly, 
chance models can be perceived only after we have constructed them in our own 
minds. In the ideal Platonic world of mathematics, we can start with a probabilistic 
chicken and use deductive logic to lay a statistical egg, but in the messier world of 
empirical science, we must start with the egg as observed data and construct a 
prior probabilistic chicken as an inference. In an introductory statistics course, the 
chicken’s only value is to explain where eggs come from. It seems a bit unfair, in 
that context, at least, to ask beginning students to learn about egg-generators 
before they’ve become familiar with eggs—less extreme, but in the same spirit as 
starting the study of chemistry with quantum mechanics. 

What then, should be the place of probability in beginning instruction in 
statistics? Our position is not standard, though it is gaining adherents: first courses 
in statistics should contain essentially no formal probability theory. 

Why? First, because informal probability is sufficient for a conceptual grasp of 
inference. Although the theoretical structure of standard statistical inference is 
based on probability, the role of probability is limited to answering the question 
“What would happen if we used this method very many times?” The answer is 
given by the sampling distribution of a statistic, which records the pattern of 
variation of the outcomes of, for example, many random samples from the same 
population. If we agree that actually deriving these distributions is better left to 
more advanced study, they can be understood as distributions using the tools of 
data analysis, without the apparatus of formal probability. Rules for $P(A \cup B)$ add 
very little to a statistics course. 

The second reason to avoid formal probability is that probability is conceptually 
the hardest subject in elementary mathematics. The history of probabilistic ideas (see 
[16] and [27]) is fascinating but a bit frightening. Better minds than ours long found 
the subject confusing in the extreme. Psychologists, beginning with Tversky and his 
collaborators, have demonstrated that confusion persists, even among those who 
can recite the axioms of formal probability and who can do textbook exercises. Our 
intuition of random behavior is gravely and systematically defective; see, e.g., [28] 
and the collection [19]. What is worse, mathematics educators have found no 
effective way to correct our defective intuition. Garfield and Ahlgren [15] conclude 
a review of research by stating that “teaching a conceptual grasp of probability still 
appears to be a very difficult task, fraught with ambiguity and illusion.” They 
suggest study of “how useful ideas of statistical inference can be taught indepen- 
dently of technically correct probability.” We believe that concentrating on the idea 
of a sampling distribution allows this, at least at the depth appropriate for 
beginners. 

The concepts of statistical inference, starting with sampling distributions, are of 
course also quite tough. We ought to concentrate our attention, and our students’ 
limited patience with hard ideas, on the essential ideas of statistics. We faculty 
imagine that formal probability illumines those ideas. That’s simply not true for 
almost all of our students. 


3.7. What About Mathematics Majors? Mathematics majors traditionally meet 
statistics as the second course in a year-long sequence devoted to probability and 
statistical theory. We hope it is clear that we don’t regard a tour of sufficient 
statistics, unbiasedness, maximum likelihood estimators, and the Neyman-Pearson 
theorem as a promising way to help students understand the core ideas of 
statistics. On the other hand, mathematics majors should certainly see some of the 
mathematical structure of statistical inference. What ought we do? 

Our preference is to precede the study of theory by a thorough data-oriented 
introduction to statistical ideas and methods and their applications. That is, 
mathematics students are not necessarily an exception to the principle that a first 
introduction to statistics should not be based on formal probability. If the students 
have strong quantitative backgrounds, a data-oriented course can move quickly 
enough to present genuinely useful statistics and serious applications. The need for 
theory can be made clear as we face issues of practice, and the theory makes much 
more sense when its setting in practice is clear. In many institutions, however, 
constraints or faculty hesitation make this path difficult. In others, there is little 
coordination between the “applied” and theoretical courses, so that the latter does 
not in fact build on the former. 

We ought therefore to reconsider what a one-semester introduction to statistics 
for mathematics majors and other quantitatively strong students should look like. 
This course would ordinarily and most easily follow a course in probability. Here 
we encounter another barrier: we can’t in good conscience retool both semesters of 
the standard probability-statistics sequence to optimize the introduction to 
statistics. Probability is important in its own right, not just as preparation for statistical 
theory. The more emphasis a department places on applications and modeling in 
its major curriculum, the more the probability course must play an essential role in 
this emphasis. An introduction to probability that emphasizes modeling and 
includes simulation and numerical calculation certainly sets the stage for statistics, 
but we are hesitant to move any strictly statistical ideas into the probability 
semester. The reform of probability and the reform of statistics are distinct issues. 

Our goal should be an integrated statistics course that moves through data 
analysis, data production, and inference in turn, emphasizing the organizing 
principles of each. We should certainly take advantage of and strengthen the 
student’s mathematical capacities. Although data analysis and data production 
have no unifying theory, mathematical analysis can illumine even data analysis. 
Here are a few examples. 


• A. Consider the optimality properties of measures of center for n observations.
The mean minimizes the mean squared error; the median minimizes
the mean absolute error (and need not be unique); the midrange minimizes
the maximum absolute (or squared) error; try minimizing the median
absolute error for n = 3 and examine the unpleasant behavior of the
resulting measure.

• B. Students met the Chebychev inequality while studying probability. Now
they may meet the interesting inequality $|\mu - m| \le \sigma$ linking the mean,
median, and standard deviation of any distribution [29]. Describe one-sample
data by the empirical distribution (probability 1/n on each observed
point) to draw conclusions about how far apart the sample mean
and median may be.

• C. The least-squares regression line is the analog of the mean $\bar{x}$ for predicting
y from x. Derive it. Then explore, perhaps using software, analogs of
the other measures mentioned in A.
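
Examples A and B can be checked numerically before they are proved. The following is a minimal sketch, assuming a small made-up data set and a brute-force search over a grid of candidate centers; the function names and the data are illustrative only and do not come from any particular package.

```python
from statistics import mean, median, pstdev

def mean_squared_error(data, c):
    return sum((x - c) ** 2 for x in data) / len(data)

def mean_absolute_error(data, c):
    return sum(abs(x - c) for x in data) / len(data)

def max_absolute_error(data, c):
    return max(abs(x - c) for x in data)

def best_center(loss, data, grid):
    """Grid point at which the given loss is smallest (brute force)."""
    return min(grid, key=lambda c: loss(data, c))

data = [1.0, 2.0, 4.0, 7.0, 11.0]            # made-up observations
grid = [i / 100 for i in range(0, 1201)]     # candidate centers 0.00, 0.01, ..., 12.00

center_mse = best_center(mean_squared_error, data, grid)
center_mae = best_center(mean_absolute_error, data, grid)
center_max = best_center(max_absolute_error, data, grid)
midrange = (min(data) + max(data)) / 2

print(center_mse, mean(data))      # minimizer of squared error vs. the mean
print(center_mae, median(data))    # minimizer of absolute error vs. the median
print(center_max, midrange)        # minimizer of maximum error vs. the midrange
# Example B on the empirical distribution: |mean - median| <= standard deviation
print(abs(mean(data) - median(data)), pstdev(data))
```

The grid search is deliberately naive; its only purpose is to let students see the three minimizers emerge before they derive them.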


Data production lends itself to probability calculations that illustrate how likely 
it is that random assignments will be unbalanced in specific ways; the advantages 
of large samples soon become clear. 
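
One such calculation can be sketched as follows, under assumptions of our own choosing: 2g subjects, of whom a given number carry some trait, are split at random into a treatment group and a control group of g subjects each, and we ask how likely it is that the proportions of carriers in the two groups differ by at least a stated percentage. The function name and parameters are hypothetical; the count of carriers in the treatment group follows the hypergeometric distribution.

```python
from math import comb

def prob_gap_at_least(group_size, carriers, min_gap_percent):
    """Probability that the proportions of carriers in two randomly formed
    groups of equal size differ by at least min_gap_percent percent."""
    total = 2 * group_size
    favorable = 0
    lowest = max(0, carriers - group_size)
    highest = min(carriers, group_size)
    for x in range(lowest, highest + 1):           # x carriers land in the treatment group
        gap_count = abs(x - (carriers - x))        # difference in carrier counts
        if 100 * gap_count >= min_gap_percent * group_size:
            favorable += comb(carriers, x) * comb(total - carriers, group_size - x)
    return favorable / comb(total, group_size)

# Half of the subjects are carriers; a gap of 20 percentage points counts as unbalanced.
for g in (10, 25, 50, 100):
    print(g, prob_gap_at_least(g, g, 20))
```

Running the loop shows the chance of a 20-point imbalance shrinking rapidly as the group size grows.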

Very nice. We can give our students a balanced introduction to statistics that 
makes use of their knowledge of mathematics. The inevitable consequence is that 
we spend less time on inference. We must decide what to preserve and what to cut. 
There is as yet no consensus, because, despite much grumbling, the reform of the 
math major sequence has not yet begun. Imagining such a reform is a good place 
to end a discussion of statistics, mathematics, and teaching. This is your take-home 
exam: design a better one-semester statistics course for mathematics majors. 


REFERENCES 


1. Aldous, David (1994), Triangulating the circle, at random, Amer. Math. Monthly 101, 223-233. 
The remark appears in the biographical note accompanying the paper. 

2. Bjerkedal, T. (1960), Acquisition of resistance in guinea pigs infected with different doses of 
virulent tubercle bacilli, American Journal of Hygiene 72, 130-148. 

3. Boyer, Paul and Stephen Nissenbaum (1972). Salem Village Witchcraft. Belmont, CA: Wadsworth 
Publishing Co. 

4. Boyer, Paul and Stephen Nissenbaum (1974). Salem Possessed. Cambridge, MA: Harvard Univer- 
sity Press. 




5. Bullock, James O. (1994), Literacy in the language of mathematics, Amer. Math. Monthly 101,
735-743.

6. Bunker, John P., Benjamin A. Barnes, and Frederick Mosteller (eds.) (1977), Costs, Risks and
Benefits of Surgery. New York: Oxford University Press.

7. Chambers, John M., William S. Cleveland, Beat Kleiner, and Paul A. Tukey (1983), Graphical
Methods for Data Analysis. Belmont, CA: Wadsworth.

8. Chambers, John M. and Trevor J. Hastie (1992), Statistical Models in S. Pacific Grove, CA:
Wadsworth.

9. Cleveland, William S. and Mary E. McGill (eds.) (1988), Dynamic Graphics for Statistics. Belmont,
CA: Wadsworth.

10. Cobb, George W. (1991), Teaching statistics: more data, less lecturing, Amstat News, December
1991, pp. 1, 4.

11. Cobb, George W. (1992), Teaching statistics, in L. A. Steen (ed.) Heeding the Call for Change:
Suggestions for Curricular Action, MAA Notes 22. Washington, DC: Mathematical Association of
America.

12. Cochran, W. G. (1968). The effectiveness of adjustment by subclassification in removing bias in
observational studies, Biometrics 24, 205-213.

13. Crystal, David (ed.) (1994), The Cambridge Factfinder. Cambridge: Cambridge University Press,
pp. 174-175.

14. Efron, Bradley and Rob Tibshirani (1991), Statistical data analysis in the computer age, Science
253, 390-395.

15. Garfield, Joan and Andrew Ahlgren (1988), Difficulties in learning basic concepts in probability
and statistics: implications for research, Journal for Research in Mathematics Education 19, 44-63.

16. Gigerenzer, G., Z. Swijtink, T. Porter, L. Daston, J. Beatty, and L. Krüger (1989), The Empire of
Chance. Cambridge: Cambridge University Press.

17. Hoaglin, D. C. (1992), Diagnostics, in D. C. Hoaglin and D. S. Moore (eds.), Perspectives on
Contemporary Statistics, MAA Notes 21. Washington, DC: Mathematical Association of America,
pp. 123-144.

18. Hoaglin, David C. and David S. Moore (eds.) (1992), Perspectives on Contemporary Statistics, MAA
Notes 21. Washington, DC: Mathematical Association of America.

19. Kapadia, R. and M. Borovcnik (eds.) (1991), Chance Encounters: Probability in Education.
Dordrecht: Kluwer.

20. Longfellow, Henry Wadsworth (1847), Evangeline, Introduction, 1.1.

21. Lyle, Roseann M. et al. (1987), Blood pressure and metabolic effects of calcium supplementation
in normotensive white and black men, Journal of the American Medical Association 257, 1772-1776.
Dr. Lyle provided the data in the example.

22. Moore, David S. (1988), Should mathematicians teach statistics (with discussion), College Math.
Journal 19, 3-7.


23. Moore, David S. (1992), What is statistics? in David C. Hoaglin and David S. Moore (eds.), 
Perspectives on Contemporary Statistics, MAA Notes 21. Washington, DC: Mathematical Associa- 
tion of America, pp. 1-18. 

24. Moore, David S. (1992), Teaching statistics as a respectable subject, in Florence Gordon and 
Sheldon Gordon (eds.), Statistics for the Twenty-First Century, MAA Notes 26. Washington, DC: 
Mathematical Association of America. 

25. Moore, David S. (1995), The Basic Practice of Statistics. New York: W. H. Freeman. 

26. Rosenbaum, Paul R. (1995), Observational Studies. New York: Springer-Verlag, p. 60. 

27. Stigler, S. M. (1986), The History of Statistics: The Measurement of Uncertainty Before 1900. 
Cambridge, Mass: Belknap. 

28. Tversky, Amos and Daniel Kahneman (1983), Extensional versus intuitive reasoning: The conjunc- 
tion fallacy in probability judgment, Psychological Review 90, 293-315. 

29. Watson, G. S. (1994), letter to the editor, The American Statistician 48, p. 269. This is the last in a 
sequence of comments on this inequality, and contains references to the earlier contributions. 

30. Weisberg, Sanford (1985). Applied Linear Regression, 2nd edition. New York: John Wiley and 
Sons, p. 230. 

Department of Mathematics, Statistics and Computer Science
Mount Holyoke College
South Hadley, MA 01075
gcobb@mtholyoke.edu

Department of Statistics
Purdue University
West Lafayette, IN 47907
dsm@stat.purdue.edu




When Is A Linear Operator Diagonalizable? 


Marco Abate 


INTRODUCTION. As often happens, everything began with a mistake. I was
teaching for the third year in a row a linear algebra course to engineering 
freshmen. One of the highlights of the course was eigenvector theory, and in 
particular the diagonalization of linear operators on finite-dimensional vector 
spaces (i.e., of square real or complex matrices). Toward the end of the course I 
assigned a standard homework: prove that the matrix

$$A = \begin{pmatrix} -1 & -1 & 2 \\ -1 & 0 & 1 \\ 0 & -1 & 1 \end{pmatrix}$$

is diagonalizable. Easy enough, I thought. The characteristic polynomial is

$$p_A(\lambda) = \det(A - \lambda I_3) = -\lambda^3 + \lambda,$$

whose roots are evidently $0$, $1$, $-1$. We have three distinct eigenvalues in a
three-dimensional space, and a standard theorem ensures that $A$ is diagonalizable.

To my surprise, the students came complaining that they were unable to solve 
the exercise. Perplexed (some of the complaining students were very bright), I 
looked over the exercise again—and I understood. What happened was that, in the 
homework, I actually gave them the matrix

$$B = \begin{pmatrix} -1 & -1 & 2 \\ -1 & 0 & 1 \\ 0 & -1 & -1 \end{pmatrix},$$

whose characteristic polynomial is

$$p_B(\lambda) = -\lambda^3 - 2\lambda^2 - \lambda + 2,$$


which has no rational roots. The students were unable to compute the eigenvalues 
of B, and they got stuck. 

This accident started me wondering whether it might be possible to decide when 
a linear operator T on a finite-dimensional real or complex vector space is 
diagonalizable without computing the eigenvalues. If one is looking for an ortho- 
normal basis of eigenvectors, the answer is well known to be yes: the spectral 
theorem says that such a basis exists in the complex case if and only if T is normal 
(i.e., it commutes with its adjoint), and if and only if T is symmetric in the real 
case. The aim of this note is to give an explicit procedure to decide whether a 
given linear operator on a finite-dimensional real or complex vector space is 
diagonalizable. By “explicit” I mean that it can always be worked out with pen and 
paper; it can be long, it can be tedious, but it can be done. Its ingredients (the 
minimal polynomial and Sturm’s theorem) are not new; but putting them together 
yields a result that can be useful as an aside in linear algebra classes.


1. THE MINIMAL POLYNOMIAL. The first main ingredient in our procedure is
the minimal polynomial. Let $T\colon V \to V$ be a linear operator on a finite-dimensional
vector space over the field $\mathbb{K}$. We denote by $T^k$ the composition of $T$ with
itself $k$ times, and for any polynomial $p(t) = a_k t^k + \cdots + a_0 \in \mathbb{K}[t]$ we put

$$p(T) = a_k T^k + \cdots + a_1 T + a_0\,\mathrm{id}_V,$$

and say that $p$ is monic if $a_k = 1$. A minimal polynomial $\mu_T \in \mathbb{K}[t]$ of the linear
operator $T$ is a monic polynomial of minimal degree such that $\mu_T(T) = 0$.


The theory of the minimal polynomial is standard. For completeness, I briefly 
recall the results we shall need. First of all: 


Proposition 1.1. Let $T\colon V \to V$ be a linear operator on a finite-dimensional vector
space $V$ over the field $\mathbb{K}$. Then:


(i) the minimal polynomial $\mu_T$ of $T$ exists, has degree at most $n = \dim V$, and is
unique;

(ii) if $p \in \mathbb{K}[t]$ is such that $p(T) = 0$, then there is some $q \in \mathbb{K}[t]$ such that
$p = q\,\mu_T$.


For our procedure it is important to show that the minimal polynomial can be 
explicitly computed. Take $v \in V$, and let $d$ be the minimal non-negative integer
such that the vectors $\{v, T(v), \dots, T^d(v)\}$ are linearly dependent. Clearly $d \le n$
always; $d = 0$ if and only if $v = 0$, and $d = 1$ if and only if $v$ is an eigenvector
of $T$. Choose $a_0, \dots, a_{d-1} \in \mathbb{K}$ such that

$$T^d(v) + a_{d-1}T^{d-1}(v) + \cdots + a_1 T(v) + a_0 v = 0$$

(note that we can assume the coefficient of $T^d(v)$ to be 1 because of the
minimality of $d$), and then set

$$\mu_{T,v}(t) = t^d + a_{d-1}t^{d-1} + \cdots + a_1 t + a_0 \in \mathbb{K}[t].$$

By definition, $v \in \operatorname{Ker} \mu_{T,v}(T)$; more precisely, $\mu_{T,v}$ is the monic polynomial
$p \in \mathbb{K}[t]$ of least degree such that $v \in \operatorname{Ker} p(T)$.

Now, if $p \in \mathbb{K}[t]$ is any common multiple of $\mu_{T,v_1}$ and $\mu_{T,v_2}$ for any two
vectors $v_1$ and $v_2$, then both $v_1$ and $v_2$ belong to $\operatorname{Ker} p(T)$. More generally, if
$\mathcal{B} = \{v_1, \dots, v_n\}$ is a basis of $V$, and $p$ is any common multiple of $\mu_{T,v_1}, \dots, \mu_{T,v_n}$,
then $\mathcal{B} \subset \operatorname{Ker} p(T)$, and thus $p(T) = 0$. Hence the following result comes as no
surprise:


Proposition 1.2. Let $T\colon V \to V$ be a linear operator on a finite-dimensional vector
space $V$ over the field $\mathbb{K}$. Let $\mathcal{B} = \{v_1, \dots, v_n\}$ be a basis of $V$. Then $\mu_T$ is the least
common multiple of $\mu_{T,v_1}, \dots, \mu_{T,v_n}$.


Proof: Let $p \in \mathbb{K}[t]$ be the least common multiple of $\mu_{T,v_1}, \dots, \mu_{T,v_n}$. We have
already remarked that $p(T) = 0$, and so $\mu_T$ divides $p$. Conversely, for $j = 1, \dots, n$
write $\mu_T = q_j \mu_{T,v_j} + r_j$ with $\deg r_j < \deg \mu_{T,v_j}$. Then

$$0 = \mu_T(T)v_j = q_j(T)\bigl(\mu_{T,v_j}(T)v_j\bigr) + r_j(T)v_j = r_j(T)v_j,$$

and the minimality of the degree of $\mu_{T,v_j}$ forces $r_j = 0$. Since every $\mu_{T,v_j}$ divides
$\mu_T$, their least common multiple $p$ also divides $\mu_T$, and hence $p = \mu_T$. ∎


Thus one method to compute the minimal polynomial is to compute the
polynomials $\mu_{T,v_1}, \dots, \mu_{T,v_n}$, and then find their least common multiple. To avoid
unnecessary calculations, it could be useful to remember that $\deg \mu_T \le n$.


Example 1. Let us compute the minimal polynomials of the matrices $A$ and $B$ of
the introduction. Let $\mathcal{B} = \{e_1, e_2, e_3\}$ be the canonical basis of $\mathbb{R}^3$. We have

$$Ae_1 = Be_1 = \begin{pmatrix} -1 \\ -1 \\ 0 \end{pmatrix}, \qquad A^2e_1 = B^2e_1 = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix},$$

$$A^3e_1 = \begin{pmatrix} -1 \\ -1 \\ 0 \end{pmatrix} = Ae_1, \qquad B^3e_1 = \begin{pmatrix} -1 \\ -1 \\ -2 \end{pmatrix} = -2B^2e_1 - Be_1 + 2e_1;$$

therefore

$$\mu_{A,e_1}(t) = t^3 - t, \qquad \mu_{B,e_1}(t) = t^3 + 2t^2 + t - 2.$$

Since $\deg \mu_{A,e_1} = 3$ and the minimal polynomial of $A$ should be a monic multiple
of $\mu_{A,e_1}$ of degree at most three, we can conclude that $\mu_A = \mu_{A,e_1}$ without
computing $\mu_{A,e_2}$ and $\mu_{A,e_3}$ (and it is easy to check that $\mu_{A,e_2}(t) = t^2 - t$ and
$\mu_{A,e_3}(t) = t^3 - t$). For the same reason we have $\mu_B = \mu_{B,e_1}$.
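
The computation of the polynomial attached to a single vector can be mechanized. The following is a minimal sketch in exact rational arithmetic, not a definitive implementation: the helper names (`mat_vec`, `solve_in_span`, `local_min_poly`) are our own, polynomials are stored as coefficient lists starting with the constant term, and the least common multiple over a basis is left out.

```python
from fractions import Fraction

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def solve_in_span(columns, target):
    """Coefficients c with sum(c[j] * columns[j]) == target, or None if target
    is not in the span. The columns are assumed linearly independent."""
    n, d = len(target), len(columns)
    rows = [[Fraction(columns[j][i]) for j in range(d)] + [Fraction(target[i])]
            for i in range(n)]
    r = 0
    for c in range(d):                              # Gauss-Jordan elimination
        pivot = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        piv = rows[r][c]
        rows[r] = [x / piv for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    if any(rows[i][d] != 0 for i in range(r, n)):   # inconsistent system
        return None
    return [rows[j][d] for j in range(d)]

def local_min_poly(A, v):
    """Monic polynomial p of least degree with p(A)v = 0, constant term first."""
    krylov = []                                     # v, Av, A^2 v, ...
    w = [Fraction(x) for x in v]
    while True:
        c = solve_in_span(krylov, w)
        if c is not None:                           # A^d v = sum c_i A^i v
            return [-ci for ci in c] + [Fraction(1)]
        krylov.append(w)
        w = mat_vec(A, w)

A = [[-1, -1, 2], [-1, 0, 1], [0, -1, 1]]
B = [[-1, -1, 2], [-1, 0, 1], [0, -1, -1]]
print(local_min_poly(A, [1, 0, 0]))   # coefficients of t^3 - t
print(local_min_poly(B, [1, 0, 0]))   # coefficients of t^3 + 2t^2 + t - 2
```

Since each of these two polynomials already has degree 3, it is the minimal polynomial of the corresponding matrix, by the degree argument used in the example.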


Let $\lambda_1, \dots, \lambda_k \in \mathbb{K}$ be the distinct eigenvalues of $T$. If $T$ is diagonalizable, then
Proposition 1.2 immediately yields $\mu_T(t) = (t - \lambda_1)\cdots(t - \lambda_k)$. This is the standard
characterization of diagonalizable linear operators:


Theorem 1.3. Let $T\colon V \to V$ be a linear operator on a finite-dimensional vector space
$V$ over the field $\mathbb{K}$. Then $T$ is diagonalizable if and only if $\mu_T$ is of the form

$$\mu_T(t) = (t - \lambda_1)\cdots(t - \lambda_k), \tag{1.1}$$

where $\lambda_1, \dots, \lambda_k$ are distinct elements of $\mathbb{K}$.


Therefore to decide whether a given linear operator on a finite-dimensional 
vector space is diagonalizable it suffices to check whether its minimal polynomial is 
of the form (1.1). 


2. THE PROCEDURE. Our aim now is to find an effective procedure to decide
whether a given polynomial $p \in \mathbb{K}[t]$ can be written in the form (1.1). To do so, we
need to know when all the roots of $p$ have multiplicity one, and when they all
belong to the field $\mathbb{K}$. The first question has a standard answer:


Proposition 2.1. Let $p \in \mathbb{K}[t]$ be a non-constant polynomial, and let $p' \in \mathbb{K}[t]$
denote its derivative. Then the following assertions are equivalent:


(i) $p$ admits a root in $\mathbb{K}$ of multiplicity greater than 1;
(ii) $p$ and $p'$ have a common root in $\mathbb{K}$;
(iii) the greatest common divisor $\mathrm{g.c.d.}(p, p')$ of $p$ and $p'$ has a root in $\mathbb{K}$.


Recalling Theorem 1.3 we get the following


Corollary 2.2. Let $T\colon V \to V$ be a linear operator on a finite-dimensional vector space
$V$ over the field $\mathbb{K}$. Then:


(i) If $\mathbb{K}$ is algebraically closed (e.g., $\mathbb{K} = \mathbb{C}$), then $T$ is diagonalizable if and only
if $\mathrm{g.c.d.}(\mu_T, \mu_T') = 1$;

(ii) If $\mathbb{K}$ is not algebraically closed (e.g., $\mathbb{K} = \mathbb{R}$), then $T$ is diagonalizable if and
only if all the roots of $\mu_T$ are in $\mathbb{K}$ and $\mathrm{g.c.d.}(\mu_T, \mu_T') = 1$.

To decide whether a complex linear operator $T$ is diagonalizable it then suffices
to compute the greatest common divisor of $\mu_T$ and $\mu_T'$. On the other hand, if
$\mathbb{K} = \mathbb{R}$ this is not enough; to complete the picture we need Sturm’s theorem, and
to state it we need a few more definitions.

Let $\mathbf{c} = (c_0, \dots, c_s) \in \mathbb{R}^{s+1}$ be a finite sequence of real numbers. If $c_0 \cdots c_s \ne 0$,
the number of variations in sign of $\mathbf{c}$ is the number of indices $1 \le j \le s$ such that
$c_{j-1}c_j < 0$ (that is, such that $c_{j-1}$ and $c_j$ have opposite sign). If some element of $\mathbf{c}$
is zero, then the number of variations in sign of $\mathbf{c}$ is the number of variations in sign
of the sequence of non-zero elements of $\mathbf{c}$. We denote the number of variations in
sign of $\mathbf{c}$ by $V_{\mathbf{c}}$.

Now let $p \in \mathbb{R}[t]$ be a non-constant polynomial. The standard sequence associated
with $p$ is the sequence $p_0, \dots, p_s \in \mathbb{R}[t]$ defined by

$$\begin{aligned}
p_0 &= p, \qquad p_1 = p', \\
p_0 &= q_1 p_1 - p_2, && \text{with } \deg p_2 < \deg p_1, \\
&\;\;\vdots \\
p_{j-1} &= q_j p_j - p_{j+1}, && \text{with } \deg p_{j+1} < \deg p_j, \\
&\;\;\vdots \\
p_{s-1} &= q_s p_s, && \text{that is, } p_{s+1} = 0.
\end{aligned}$$

In other words, the standard sequence is obtained by changing the sign in the
remainder term of the Euclidean algorithm for the computation of $\mathrm{g.c.d.}(p, p')$. In
particular, $\mathrm{g.c.d.}(p, p') = 1$ if and only if $p_s$ is constant.

Sturm’s theorem then says: 


Theorem 2.3. Let $p \in \mathbb{R}[t]$ be a polynomial such that $\mathrm{g.c.d.}(p, p') = 1$, and take
$a < b$ such that $p(a)p(b) \ne 0$. Let $p_0, \dots, p_s \in \mathbb{R}[t]$ be the standard sequence
associated with $p$. Then the number of roots of $p$ in $[a, b]$ is equal to $V_{\mathbf{a}} - V_{\mathbf{b}}$, where
$\mathbf{a} = (p_0(a), \dots, p_s(a))$ and $\mathbf{b} = (p_0(b), \dots, p_s(b))$.


For a proof see [1, pp. 295-299].

Now, for any polynomial $p(t) = a_d t^d + \cdots + a_0 \in \mathbb{R}[t]$ there exists $M > 0$ such
that $p(t)$ has the same sign as $a_d$, the leading coefficient of $p$, if $t > M$ and the
same sign as $(-1)^d a_d$ if $t < -M$. In particular, all the real roots of $p$ are contained in
$[-M, M]$, and Sturm’s theorem implies the following:


Corollary 2.4. Let $p \in \mathbb{R}[t]$ be a non-constant polynomial such that $\mathrm{g.c.d.}(p, p') = 1$.
Let $p_0, \dots, p_s \in \mathbb{R}[t]$ be the standard sequence associated with $p$, and let $d_j$ be the
degree and $c_j \in \mathbb{R}$ the leading coefficient of $p_j$ for $j = 0, \dots, s$. Then the number of
real roots of $p$ is given by $V_- - V_+$, where $V_-$ is the number of variations in sign of the
sequence $((-1)^{d_0}c_0, \dots, (-1)^{d_s}c_s)$, and $V_+$ is the number of variations in sign of the
sequence $(c_0, \dots, c_s)$.


Proof: It suffices to choose $M > 0$ large enough so that $p_j(t)$ has the same sign as
$c_j$ when $t > M$ and the same sign as $(-1)^{d_j}c_j$ when $t < -M$, for each $j = 0, \dots, s$,
and then apply Sturm’s theorem with $a = -M$ and $b = M$. ∎

We finally have all the ingredients necessary to state the desired procedure. Let
$T\colon V \to V$ be a linear operator on a finite-dimensional vector space $V$ over $\mathbb{K} = \mathbb{R}$
or $\mathbb{C}$. Then:


(1) Compute the minimal polynomial $\mu_T$.

(2) Compute the standard sequence $p_0, \dots, p_s$ associated with $\mu_T$. If $p_s$ is not
constant, then $T$ is not diagonalizable. If $p_s$ is constant and $\mathbb{K} = \mathbb{C}$, then $T$
is diagonalizable. If $p_s$ is constant and $\mathbb{K} = \mathbb{R}$, go to Step (3).

(3) Compute $V_-$ and $V_+$ for $\mu_T$. Then $T$ is diagonalizable if and only if
$V_- - V_+ = \deg \mu_T$.
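
Steps (2) and (3) are purely mechanical once the minimal polynomial is known. The following is a minimal sketch in exact rational arithmetic, assuming the minimal polynomial is supplied as a list of coefficients starting with the constant term and has degree at least 1; the function names are our own and are not taken from any library.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a divided by b (coefficient lists, constant term first)."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        while a and a[-1] == 0:              # drop the cancelled leading terms
            a.pop()
    return a

def standard_sequence(p):
    """p_0 = p, p_1 = p', then negated remainders of the Euclidean algorithm."""
    p = [Fraction(c) for c in p]
    seq = [p, [i * c for i, c in enumerate(p)][1:]]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:                            # zero remainder: the sequence stops
            return seq
        seq.append([-c for c in r])

def sign_variations(values):
    nonzero = [v for v in values if v != 0]
    return sum(1 for x, y in zip(nonzero, nonzero[1:]) if x * y < 0)

def is_diagonalizable(min_poly, over_reals):
    """Apply Steps (2) and (3) to a given minimal polynomial."""
    seq = standard_sequence(min_poly)
    if len(seq[-1]) > 1:                     # p_s is not constant: repeated root
        return False
    if not over_reals:
        return True
    lead = [q[-1] for q in seq]
    degs = [len(q) - 1 for q in seq]
    v_minus = sign_variations([(-1) ** d * c for d, c in zip(degs, lead)])
    v_plus = sign_variations(lead)
    return v_minus - v_plus == len(min_poly) - 1

mu_A = [0, -1, 0, 1]     # t^3 - t
mu_B = [-2, 1, 2, 1]     # t^3 + 2t^2 + t - 2
print(is_diagonalizable(mu_A, over_reals=True))
print(is_diagonalizable(mu_B, over_reals=True), is_diagonalizable(mu_B, over_reals=False))
```

Working with `Fraction` keeps every remainder exact, so the signs entering the count of variations cannot be spoiled by rounding.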


Thus we are always able to decide whether a given linear operator on a 
finite-dimensional real or complex vector space is diagonalizable or not. One 
feature that I find particularly interesting in this procedure is that the solution of a 
typical linear algebra problem is reduced to an apparently totally unrelated 
manipulation of polynomials, showing in a simple case how different parts of 
mathematics can be connected in unexpected ways. 

We end this note with some examples of application of our procedure. 


Example 2. First of all, we solve the original homework. We have already computed
the minimal polynomial $\mu_B(t) = t^3 + 2t^2 + t - 2$. The standard sequence
associated with $\mu_B$ is

$$p_0(t) = t^3 + 2t^2 + t - 2, \qquad p_1(t) = 3t^2 + 4t + 1,$$

$$p_2(t) = \tfrac{2}{9}t + \tfrac{20}{9}, \qquad p_3(t) = -261.$$

Since $p_3$ is constant, $B$ is diagonalizable over $\mathbb{C}$. To compute $V_- - V_+$ we count
the number of variations in sign of the sequences $(-1, 3, -\tfrac{2}{9}, -261)$ and
$(1, 3, \tfrac{2}{9}, -261)$. We obtain

$$V_- - V_+ = 2 - 1 = 1 < 3 = \deg \mu_B,$$

and so $B$ is not diagonalizable over $\mathbb{R}$. On the other hand, the standard sequence
associated with $\mu_A$ is

$$p_0(t) = t^3 - t, \qquad p_1(t) = 3t^2 - 1, \qquad p_2(t) = \tfrac{2}{3}t, \qquad p_3(t) = 1.$$

The number of variations in sign of $(-1, 3, -\tfrac{2}{3}, 1)$ is $V_- = 3$, and of $(1, 3, \tfrac{2}{3}, 1)$ is
$V_+ = 0$; therefore $V_- - V_+ = 3 - 0 = 3$, and thus $A$ is diagonalizable over $\mathbb{R}$ (as it
should be).


Both these matrices were diagonalizable over $\mathbb{C}$; since their minimal polynomials
have degree 3, necessarily their (complex) eigenvalues are all distinct. In the
next example this is not the case:


Example 3. Let

$$C = \begin{pmatrix} 0 & -2 & 2 & 6 \\ 2 & 4 & -1 & -5 \\ -3 & -4 & 3 & 7 \\ 1 & 2 & -1 & -3 \end{pmatrix}.$$

To compute the minimal polynomial of $C$ we start, as in Example 1, by applying
the iterates of $C$ to $e_1$. We get

$$Ce_1 = \begin{pmatrix} 0 \\ 2 \\ -3 \\ 1 \end{pmatrix}, \quad C^2e_1 = \begin{pmatrix} -4 \\ 6 \\ -10 \\ 4 \end{pmatrix}, \quad C^3e_1 = \begin{pmatrix} -8 \\ 6 \\ -14 \\ 6 \end{pmatrix}, \quad C^4e_1 = \begin{pmatrix} -4 \\ -8 \\ 0 \\ 0 \end{pmatrix}.$$

It is easy to check that $\{e_1, Ce_1, C^2e_1, C^3e_1\}$ are linearly independent and that

$$C^4e_1 - 4C^3e_1 + 8C^2e_1 - 8Ce_1 + 4e_1 = 0;$$

therefore $\deg \mu_{C,e_1} = 4$ and, as in Example 1, we can conclude that

$$\mu_C(t) = \mu_{C,e_1}(t) = t^4 - 4t^3 + 8t^2 - 8t + 4.$$

The standard sequence associated with $\mu_C$ starts with

$$p_0(t) = t^4 - 4t^3 + 8t^2 - 8t + 4, \qquad p_1(t) = 4t^3 - 12t^2 + 16t - 8,$$

$$p_2(t) = -t^2 + 2t - 2.$$

Since $p_2$ divides $p_1$, it is the last polynomial in the sequence; since it is not
constant, we conclude that $C$ is not diagonalizable even over $\mathbb{C}$ (and in particular it
cannot have four distinct eigenvalues).


Our final example involves a minimal polynomial of degree strictly less than the 
dimension of the space: 


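Example 4. Let

$$D = \begin{pmatrix} 2 & -2 & 2 & 8 \\ -2 & 4 & -1 & -9 \\ 1 & -4 & 3 & 11 \\ -1 & 2 & -1 & -5 \end{pmatrix}.$$

To compute the minimal polynomial of $D$ we start again by applying the iterates of
$D$ to $e_1$. We get

$$De_1 = \begin{pmatrix} 2 \\ -2 \\ 1 \\ -1 \end{pmatrix}, \qquad D^2e_1 = \begin{pmatrix} 2 \\ -4 \\ 2 \\ -2 \end{pmatrix} = 2De_1 - 2e_1;$$

therefore $\mu_{D,e_1}(t) = t^2 - 2t + 2$, and we cannot conclude right now that $\mu_{D,e_1} = \mu_D$.
Proceeding with the computations we get

$$De_2 = \begin{pmatrix} -2 \\ 4 \\ -4 \\ 2 \end{pmatrix}, \qquad D^2e_2 = \begin{pmatrix} -4 \\ 6 \\ -8 \\ 4 \end{pmatrix} = 2De_2 - 2e_2,$$

$$De_3 = \begin{pmatrix} 2 \\ -1 \\ 3 \\ -1 \end{pmatrix}, \qquad D^2e_3 = \begin{pmatrix} 4 \\ -2 \\ 4 \\ -2 \end{pmatrix} = 2De_3 - 2e_3,$$

$$De_4 = \begin{pmatrix} 8 \\ -9 \\ 11 \\ -5 \end{pmatrix}, \qquad D^2e_4 = \begin{pmatrix} 16 \\ -18 \\ 22 \\ -12 \end{pmatrix} = 2De_4 - 2e_4;$$

hence we have $\mu_{D,e_1} = \mu_{D,e_2} = \mu_{D,e_3} = \mu_{D,e_4}$ and $\mu_D(t) = t^2 - 2t + 2$. In particular,
$D$ has (at most) two distinct complex eigenvalues.

The standard sequence associated with $\mu_D$ is

$$p_0(t) = t^2 - 2t + 2, \qquad p_1(t) = 2t - 2, \qquad p_2(t) = -1.$$

Since $p_2$ is constant, $D$ is diagonalizable over $\mathbb{C}$. Counting the variations in sign
of the sequences $(1, -2, -1)$ and $(1, 2, -1)$, we obtain

$$V_- - V_+ = 1 - 1 = 0 < 2 = \deg \mu_D,$$

and so $D$ is not diagonalizable over $\mathbb{R}$.



Energy Arguments in the Theory
of Algorithms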


Eric Bach 


My goal is to convince you that physical arguments can provide valuable insight 
into the behavior of algorithms. In particular, several classic results from combina- 
torics and the theory of algorithms are easy to prove, once you have an appropriate 
physical model. These demonstrations have as their theme the simple but powerful 
idea of energy conservation: in any closed physical system, the total energy is 
constant. 

Certainly, physical ideas have influenced the theory of algorithms in other ways. 
One famous example is the concept of entropy, which has found wide use in 
information theory. For example, you are probably familiar with the result that
$\log_2 n!$ comparisons are required to sort n keys. I would like to argue here that
energy is equally useful, and deserves a permanent place in your toolbox. Further- 
more, abstracting the essential ideas in energy conservation arguments leads to the 
method of amortized analysis, for which this article can serve as an introduction. 


1. PATH LENGTHS IN TREES. Let T be a binary tree, in which each node has 
at most 2 children. There is a unique node, called the root, with no parents. In the 
study of searching algorithms, an important role is played by the internal path 
length, defined by the following sum over the nodes of T:

$$P(T) = \sum_x [\text{number of edges between } x \text{ and the root}].$$

We also consider the external path length, defined by a sum over the “missing”
children of the nodes of T:

$$E(T) = \sum_y [\text{number of edges between } y \text{ and the root}].$$


It is traditional to call these missing children external nodes. 

It is well known that we minimize these functions by putting all the leaves of T 
on two adjacent levels, or if possible, on one level. We now show how this follows 
immediately from physical considerations. 

We imagine that the tree grows upward from the root, in a uniform gravitational 
field. Thinking of each node as having unit mass, we assign the root a potential 
energy of 0, its children each a potential of 1, its grandchildren 2, and so on. 
Evidently P(T) is the total potential energy of the tree, which is minimized by 
successively putting as many nodes as possible at each level. This gives the result 
for P(T). 

To derive the result for E(T), we give the external nodes unit negative mass.
(Anyone bothered by negative mass can replace the gravitational field by an
electric one and use positive and negative charges.) We now imagine a disassembly
process, each step of which replaces one of the highest leaves of T and its two
external node children by one external node. Each disassembly step increases the
potential energy of the tree by 2. If T has n nodes, there are n steps in this
process, which transforms a tree of potential energy P(T) − E(T) into one with
zero energy. Therefore, by conservation of energy,

$$E(T) = P(T) + 2n,$$

and any tree minimizing P(T) must also minimize E(T).
A corresponding relation for d-ary trees,

$$E(T) = (d - 1)P(T) + dn,$$

can be proved in a similar manner.
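
The binary relation is easy to test by machine. The following is a minimal sketch, assuming a tree is represented as a nested pair (left, right) with None standing for an external node; the representation and the function names are our own choices.

```python
import random

def random_tree(n, rng):
    """A random binary tree with n nodes; None marks an external node."""
    if n == 0:
        return None
    k = rng.randrange(n)                       # number of nodes in the left subtree
    return (random_tree(k, rng), random_tree(n - 1 - k, rng))

def path_lengths(tree, depth=0):
    """Return (n, P, E): node count, internal and external path length."""
    if tree is None:
        return 0, 0, depth                     # one external node at this depth
    left_n, left_p, left_e = path_lengths(tree[0], depth + 1)
    right_n, right_p, right_e = path_lengths(tree[1], depth + 1)
    return left_n + right_n + 1, left_p + right_p + depth, left_e + right_e

rng = random.Random(1)
for n in (1, 5, 20):
    nodes, P, E = path_lengths(random_tree(n, rng))
    print(nodes, P, E, E == P + 2 * nodes)
```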


2. OPTIMAL SEARCHING ALGORITHMS. We can apply the path length results 
of the preceding section to algorithms that search for a given key x among
$x_1, \dots, x_n$. These algorithms are to be composed of queries that determine whether
x is less than, equal to, or greater than another key $x_i$. Each algorithm
determines a binary tree as follows. The root of the tree is labelled by the first key
compared to x, and the left and right subtrees are determined by the queries used 
when x is less than or greater than the root, respectively. 

Considering the root as level 1, we assign each node at level i the potential 
energy i. Thus, the energy of a node is the work (number of queries) needed to 
reach it, and if all nodes are equally likely to be sought, the average work is the 
total potential energy of the tree, divided by the number of nodes. This character- 
izes the average-case optimal search algorithms: they are exactly those whose 
leaves lie on two adjacent levels, or if possible, on one level. When $n = 2^k - 1$,
there is only one such tree, and we recover the usual binary search algorithm. 

A related problem was studied by Huffman [5] and can be phrased in our terms
as follows. We are given particles of positive masses $p_1, \dots, p_n$. Place these at the
external nodes of a binary tree so as to minimize the total potential energy. The
optimal tree does not depend on our units for mass, so we may as well assume that
the $p_i$’s sum to 1. This then corresponds to a game in which we must identify one
of n possible items, occurring with the probabilities $p_1, \dots, p_n$, using queries of
the form “is x ∈ S?”

We now derive Huffman’s algorithm for constructing an optimal tree, which
determines a search strategy minimizing expected cost. In any minimal tree T, the
two smallest masses p and q must be at the farthest distance from the root, for if
not, we could swap one of them with the farthest leaf and decrease the potential.
We don’t change the potential by moving masses horizontally, so we may as well
make them the two children of some internal node x. Now, delete these two
children and give their total mass p + q to the (now external) node x. This
produces a new tree T′. This transformation releases an amount of energy equal to
p + q, so the weighted external path lengths $l_w$ must satisfy

$$l_w(T) = (p + q) + l_w(T').$$

Recursively generating T′ by the same process, we solve the problem.
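
The recursion translates directly into a few lines of code. The following is a minimal sketch that computes only the weighted external path length of an optimal tree, not the tree itself; it assumes the masses are given as a list of positive numbers, and the function name is our own. Each pass of the loop merges the two smallest masses p and q and adds the released energy p + q to the running total.

```python
import heapq

def huffman_cost(masses):
    """Weighted external path length of an optimal binary tree for the masses."""
    heap = list(masses)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        p = heapq.heappop(heap)          # the two smallest masses
        q = heapq.heappop(heap)
        cost += p + q                    # energy released by merging them
        heapq.heappush(heap, p + q)      # their parent becomes an external node
    return cost

print(huffman_cost([1, 1, 2, 4]))        # integer masses avoid rounding
```

Keeping the masses in a heap makes each merge cost logarithmic time, though any way of repeatedly finding the two smallest masses would serve.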

In deriving Huffman’s algorithm, we have assumed that each node is a unit
distance away from its parent. It is interesting to ask if we can further reduce the
potential of T by allowing these lengths to be fractions. To prevent trivialities, we
introduce the following constraint: if a node has two children at distances l and r,
then $2^{-l} + 2^{-r} = 1$. The resulting algorithm is the same as Huffman’s, except that
it places p and q at the distances $\log_2((q + p)/p)$ and $\log_2((q + p)/q)$ away
from their parent. You can verify that the resulting tree has potential energy

$$H(p_1, \dots, p_n) = \sum_{i=1}^{n} -p_i \log_2 p_i,$$

and that this never exceeds the potential energy of a tree constructed by Huffman’s
algorithm.

By this result, any binary tree with unit edge lengths and masses $p_1, \dots, p_n$ at
the external nodes, summing to 1, has energy at least H. This relation is the source
of many lower bounds on the performance of algorithms. For example, to prove
the result on sorting mentioned in the introduction, we take $(p_i)$ to be the uniform
distribution on the n! possible permutations of the input. Since H is the well-known
entropy function, this provides one link between energy and information. For some 
others, see [10, p. 50]. 


3. COMPUTING MAX AND MIN. We now prove that any algorithm determining
the minimum and maximum of a set of n keys by comparing them must use at least
3n/2 − 2 comparisons in the worst case. This result is due to Pohl [11].

One way to prove results of this type is to provide some rule whereby the 
queries made by the algorithm can be used to construct a worst-case input for it. 
The algorithm is thus “hoist by its own curiosity”, and the rule by which this is
done is called an adversary strategy. 

We explain a strategy sufficient to get the result, in physical terms. Each key will
be in one of the four conditions V (virgin, never compared), W (won but never
lost), L (lost but never won), and E (eliminated from contention). For these we
have the following state diagram:


[State diagram: V at the top, with arrows leading down to W and to L, each of which has an arrow leading down to E.]


Imagine keys, represented by marbles, falling through this state diagram. We 
assign each marble an energy equal to its height, so the marbles in V have energy 
2, marbles in W or L have energy 1, and marbles in E have energy 0. The total 
energy of a configuration is the sum of the energies of all the marbles. The 
adversary’s rule is simple: whenever a query is presented, answer it in such a way as 
to minimize energy loss, but consistent with the answers already given. 

Suppose, for example, in computing the max and min of x, y, z, the algorithm 
determines that x < y, giving a configuration of energy 4, and then compares x to 
z. If x <z, the energy becomes 3, whereas if z <x, it drops to 2. Since both 
answers are consistent, the adversary chooses the first alternative, forcing the 
algorithm to make an additional query. 

The algorithm must get from the state in which all marbles are in V (energy 2n)
to a state in which L and W have one marble each, with the rest in E (energy 2).
Each comparison drops the energy by 0, 1, or 2. Suppose there are $M_i$ comparisons
that drop the energy by i, for $0 \le i \le 2$. Then

$$M_1 + 2M_2 = 2n - 2$$

by energy conservation, whereas

$$n/2 \ge M_2,$$

because the adversary can force an energy loss of at most 1, except for V-V
comparisons, of which there can be at most n/2. Using these relations, we find
that $M_1 + M_2 \ge 3n/2 - 2$, which gives a lower bound on the number of comparisons.

We also note that our model suggests an optimal algorithm. For simplicity we
assume that n is even. To reduce the potential energy as quickly as possible, we
first do n/2 V-V comparisons, then determine the maximum of W and the
minimum of L. This uses

$$n/2 + (n/2 - 1) + (n/2 - 1) = 3n/2 - 2$$

key comparisons, which is best possible.
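
The optimal algorithm can be sketched as follows, assuming an even number n ≥ 2 of keys held in a list; the function name and the explicit comparison counter are our own additions, included only so that the count 3n/2 − 2 can be observed.

```python
def max_and_min(keys):
    """Return (maximum, minimum, comparisons) for an even number n >= 2 of keys."""
    comparisons = 0
    winners, losers = [], []
    for a, b in zip(keys[0::2], keys[1::2]):    # the n/2 V-V comparisons
        comparisons += 1
        if a < b:
            a, b = b, a
        winners.append(a)                       # a moves to W
        losers.append(b)                        # b moves to L
    largest = winners[0]
    for w in winners[1:]:                       # n/2 - 1 comparisons within W
        comparisons += 1
        if largest < w:
            largest = w
    smallest = losers[0]
    for l in losers[1:]:                        # n/2 - 1 comparisons within L
        comparisons += 1
        if l < smallest:
            smallest = l
    return largest, smallest, comparisons

print(max_and_min([5, 3, 8, 1, 9, 2]))
```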


4. FINDING THE SECOND LARGEST KEY. We now prove a theorem, due to
Kislitsyn [6], that any algorithm using pairwise comparisons to find the largest two
elements among n keys must do at least $n + \log_2 n - 2$ comparisons on some
input.

We use a physical model to study algorithms for this problem as well. We think
of the n keys as particles, each of which starts off with unit energy. When two keys
are compared, the result is a collision that transfers all of the energy to one of the 
particles, reducing the other’s to zero. It is easy to see that a key has positive 
energy if and only if it has never lost a comparison. 

By conservation of energy, the maximum key ends up with n units of energy. 
From this, we can see that at least n — 1 comparisons must be made to determine 
the maximum, because every other key must give up its initial energy allotment. 

We now give our adversary the job of making the second largest key as difficult 
to find as possible. The adversary uses Matthew’s rule—“them that gots shall get” 
—in the following way. When two keys with positive energy are compared, the 
winner of the comparison is always one with the largest energy. Ties and queries 
involving keys of zero energy may be handled in any consistent manner. 

Suppose the maximum key was compared to $k$ other keys. Then it must have
had at least $n/2$ units of energy before its last comparison, at least $n/4$ before its
next-to-last comparison, and so on. Since its initial energy was 1, we must have
$1 \ge n/2^k$, so $k \ge \log_2 n$. We now observe that $k - 1$ of the keys compared to the
maximum must lose one additional comparison, since these must also be proved
not to be second largest, and the remaining $n - k - 1$ keys must lose at least one
comparison. The total number of comparisons is bounded below by


$k + (k - 1) + (n - k - 1) = n + k - 2 \ge n + \log_2 n - 2.$


Taking into account that the number of comparisons must be integral, it can be 
shown that this bound is best possible. 
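
One standard way to meet this bound is a knockout tournament: find the maximum in n − 1 comparisons while remembering which keys lost directly to it, then take the largest of those. The sketch below is an illustration only, not the article's own construction; the name `largest_two` and the bookkeeping lists are hypothetical.

```python
def largest_two(keys):
    """Return (largest, second_largest, comparisons) via a knockout tournament."""
    if len(keys) < 2:
        raise ValueError("need at least two keys")
    comparisons = 0
    # Each entry pairs a surviving key with the keys it has beaten directly.
    current = [(k, []) for k in keys]
    while len(current) > 1:
        nxt = []
        for i in range(0, len(current) - 1, 2):
            (a, beaten_a), (b, beaten_b) = current[i], current[i + 1]
            comparisons += 1
            if a < b:
                beaten_b.append(a)
                nxt.append((b, beaten_b))
            else:
                beaten_a.append(b)
                nxt.append((a, beaten_a))
        if len(current) % 2:
            nxt.append(current[-1])  # odd one out gets a bye this round
        current = nxt
    champion, beaten = current[0]
    # Only keys that lost directly to the champion can be second largest.
    second = beaten[0]
    for k in beaten[1:]:
        comparisons += 1
        if second < k:
            second = k
    return champion, second, comparisons
```

The champion plays at most one match per round, and there are ⌈log₂ n⌉ rounds, so the total is at most n + ⌈log₂ n⌉ − 2 comparisons.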


5. PATTERN MATCHING MACHINES. We now discuss a method, due to Knuth, 
Morris, and Pratt [8], for finding whether a given string, called the pattern, occurs 
in a text file. Suppose we are looking for AAAA in our file. Then the algorithm 
constructs a finite-state machine, which in this case is 

[Figure: (start) → A → A → A → A → (success), with a leftward failure arrow out of each character state.]


Searching works as follows. If we are in the start state we read the next 
character of the file and advance the state to the right. If we are in a state labelled 
by a character, we see if the last character read matches it. If so, we read another
character and advance to the right; if not, we follow the leftward arrow to a 
previous state, without reading a new character. Arrival at the success state signals 
that the pattern has been found. 

The key property of this algorithm is that at most 2n character comparisons are
needed to search a file of length n, once the machine has been constructed. We 
now prove this using a physical argument. Suppose the pattern has m characters. 
Starting from 0, we number the states so that each successful match increases the 
state number by 1. The states are thus 0,1,2,...,m + 1. Let us agree that when 
the machine is in state i, it contains i units of energy. (Think, if you like, of a 
spring or some other gadget that gets compressed as more and more characters are 
matched.) Each unsuccessful comparison releases at least 1 unit of energy, because 
the state number goes down. On the other hand, over the entire history of the 
algorithm, we feed at most n units of energy into the machine, because reading a 
new character is the only way to put energy in. Thus, there can be at most n 
unsuccessful comparisons. Also, there are at most n successful comparisons, since 
each match causes another input character to be read. Therefore, the total number 
of comparisons is at most 2n. 

One can show that the machine can be constructed in the following “cannibalistic”
way: search for the pattern inside itself. More precisely, if the pattern is
$p_1 \cdots p_m$, we search for it in $t = p_2 \cdots p_m$. The index of the first pattern character
compared to the text character $p_i$ determines the backward link. Granting this,
then, at most 2m + 2n character comparisons are needed to find a pattern of
length m in a file of n characters.
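
The search and the energy argument can be checked with a short sketch. It uses the common failure-function formulation of the Knuth-Morris-Pratt method, which may differ in detail from the machine drawn above; the names `failure_links` and `kmp_search` and the comparison counter are hypothetical, and the pattern is assumed to be nonempty.

```python
def failure_links(pattern):
    """fail[i] = length of the longest proper border of pattern[:i + 1]."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(pattern, text):
    """Return (index of first match or -1, character comparisons used)."""
    fail = failure_links(pattern)
    comparisons = 0
    state = 0  # number of pattern characters currently matched ("energy")
    for pos, ch in enumerate(text):
        while True:
            comparisons += 1
            if ch == pattern[state]:
                state += 1  # successful match: energy goes up by 1
                break
            if state == 0:
                break  # mismatch with nothing matched: read the next character
            state = fail[state - 1]  # mismatch: energy strictly decreases
        if state == len(pattern):
            return pos - len(pattern) + 1, comparisons
    return -1, comparisons
```

Each text character ends its inner loop with exactly one comparison (a match, or a mismatch in state 0), and every other comparison lowers the state, so the counter never exceeds twice the text length.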


6. ABSTRACT POTENTIAL FUNCTIONS AND EUCLID’S ALGORITHM. All of 
our examples have the following theme in common. We assign a number, the 
energy, to each possible state of a system, and use this number to draw conclusions 
about how it evolves over time. Even if we don’t look inside the system, but only 
keep track of the changes in energy, we can still write the net energy change as the 
sum of energy changes for the various steps that are taken. 

Linking these energy changes to the computational effort involved in making 
them has turned out to be a powerful tool in the analysis of algorithms. In the 
literature, this is usually called amortized analysis or the potential function method; 
see [13]. In this more abstract approach, we are no longer restricted to state 
functions that come from a physical model. 

I close this article with a potential-function proof that the number of bit 
operations used by Euclid’s algorithm is within a constant factor of the number 
used by the ordinary multiplication algorithm. This result is due to Collins [4]. 

Henceforth, all logarithms will be to the base 2. We also use O(f) to indicate an 
unspecified function that is bounded by a positive constant times f, and $\lfloor x \rfloor$ to
denote the greatest integer less than or equal to x.

We first discuss our cost model. If u is a natural number, its length in binary 
notation is 


$\lg u = \lfloor \log u \rfloor + 1.$


When $u \ge 2$, we have $(\lg u)/2 \le \log u < \lg u$. We assign a multiplication of $u$ by
$v$ the cost $(\lg u)(\lg v)$, and a division expressing $u = qv + r$ with $0 \le r < v$ the cost
$(\lg q)(\lg v)$. Up to constant factors, these are the number of bit operations used by
the ordinary algorithms.

We think of Euclid’s algorithm as starting with the inputs $u > v > 0$, and
replacing the pair (u, v) by (v, u mod v), for as long as v is positive. When (d, 0) is 
reached, d is the greatest common divisor of u and v. More details and a proof of 
correctness can be found in [7, pp. 14 ff.]. 

Our task is to prove that Euclid’s algorithm has total cost $O((\lg u)(\lg v))$. The
idea of the proof is to assign a non-negative potential to each state of the 
algorithm and bound each division’s cost by a constant times the resulting potential 
change. Since we can’t use more potential than we started with, the total potential 
change is bounded, and so is the total computation cost. 

We now choose a potential function. There is a natural tradeoff involved in this 
choice. If we choose a simple function, it may be difficult to relate it to the running 
time of an individual step. A function for which this is easy to do, however, can be 
complicated and hard to work with. Our choice is to assign $(u, v)$ the potential
$(\log u)(\log v)$. Since the algorithm terminates with $v = 0$, we must treat the last
division step specially. 

Consider a division step of the algorithm, not the last one. This results in
$u = qv + r$, with $0 < r < v$. The change in potential for such a step is


$(\log u)(\log v) - (\log v)(\log r) = (\log v)(\log (u/r))$
$\qquad > (\log v)(\log (u/v))$   [since $r < v$]
$\qquad \ge (\log v)(\log q)$.   [since $q \le u/v$]

Unfortunately this doesn’t tell us much when $q = 1$, but there is another lower
bound for this case. Since $u = v + r$ and $0 < r < v$, we have $u/r \ge v/r + 1 \ge 2$.
Therefore,


$(\log u)(\log v) - (\log v)(\log r) = (\log v)(\log (u/r)) \ge (\log v)(\log 2).$
In every division step but the last, then, the potential is reduced by at least


$(\log v)(\log \max\{q, 2\}) \ge \frac{(\lg v)(\lg q)}{4}.$


Now suppose the algorithm uses $n$ division steps. Let $E_i$ be the potential after
the $i$-th division step. Since all steps but the last have a positive remainder, the cost
of the first $n - 1$ steps is


$\sum_{i=1}^{n-1} (\lg q_i)(\lg v_i) \le 4 \sum_{i=1}^{n-1} (E_{i-1} - E_i) \le 4E_0 \le 4(\lg u)(\lg v).$

The last division costs no more than (lg u)(lg v), so the result is proved. 
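
The bound can be checked numerically with a small sketch of the cost model. The helper names `lg` and `euclid_with_cost` are hypothetical; the code charges each division $(\lg q)(\lg v)$ as in the text and assumes inputs with $u \ge v > 0$.

```python
def lg(u):
    """Binary length of a natural number u >= 1, i.e. floor(log2 u) + 1."""
    return u.bit_length()


def euclid_with_cost(u, v):
    """Return (gcd, total cost), charging each division (lg q)(lg v)."""
    if not (u >= v > 0):
        raise ValueError("this sketch assumes u >= v > 0")
    total = 0
    while v > 0:
        q, r = divmod(u, v)  # u = q*v + r with 0 <= r < v
        total += lg(q) * lg(v)
        u, v = v, r
    return u, total
```

For example, `euclid_with_cost(13, 8)` performs five divisions with costs 4, 3, 2, 2, 2. The argument above (at most $4(\lg u)(\lg v)$ for all but the last step, plus at most $(\lg u)(\lg v)$ for the last) gives a total of at most $5(\lg u)(\lg v)$.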

By considering how the potential function log u changes after two division 
steps, it can be shown that the number of division steps in Euclid’s algorithm is 
O(lg u). This result is usually ascribed to Lamé, although he was not the first to 
prove it; see [12]. 


7. FURTHER READING. In considering physical models, my original goal was to 
find intuitive ways to verify results that are often proved by calculation. The reader 
may wish to compare the proofs given here with the following treatments in the 
literature. Sedgewick [14, p. 39] related the path lengths of trees by examining how 
they change as a binary tree is built. Knuth [7, p. 405] gave the corresponding 
relation for d-ary trees, without proof. Manber [9, p. 147] emphasized inductive 
design of algorithms, using Huffman’s construction as an example. Baase [1, pp. 
126-133] gave the adversary strategies used in Sections 3 and 4. Cormen, Leiserson,
and Rivest [3, p. 872], following a suggestion of Tarjan [13], used amortization
to study pattern-matching automata. A potential-function proof of Collins’s result 
appears in [2], but this relied on Lamé’s theorem. 

Quine’s NF—60 Years On


Thomas Forster 


Sixty years ago in this MONTHLY, the distinguished American philosopher W.V. 
Quine published a novel approach to set theory. The title was “New Foundations 
for Mathematical Logic” [6]. The diamond anniversary is being commemorated by 
a workshop in Cambridge (England) and comes at a time of rapid increase of 
interest in the alternatives to the hitherto customary Zermelo-Fraenkel set theory,
which promises a new lease of life for the axiomatic system now known as ‘NF’; its 
creator remains in good health too. Although he is best known to a wider public 
for his philosophical writings, his most enduring and most concrete legacy for the 
next fifty years may well turn out to be his most mathematical: he gave us NF. 

Set theory is the study of sets, which are the simplest of all mathematical 
entities. Let us illustrate by contrasting sets with groups. Two distinct groups can
have the same elements and yet be told apart by the way those elements are 
related. Sets are distinguished from all other mathematical fauna by the fact that a 
set is constituted solely by its members: two sets with the same members are the 
same set. To use a bit of jargon from another age, sets are properties in extension. 
As a result, all set theories have the axiom of extensionality,
$(\forall x\,y)(x = y \leftrightarrow (\forall z)(z \in x \leftrightarrow z \in y))$; they differ in their views on which properties have
extensions.

Since set theory first sprang on the scene about a hundred years ago there has 
been a tendency to attempt to use this simplicity to simplify and illuminate the rest 
of mathematics by translating (perhaps a better word is implementing) it into set 
theory. After all, if we can represent all of mathematics as facts about these 
delightfully simple things, some facts about mathematics might become clear that 
would otherwise remain obscure. This same simplicity means that set theory is 
always a good topic on which to try out any new mathematical idea. 

Early twentieth century mathematicians used the expression “The Crisis in 
Foundations”. This crisis had many causes and—despite the disappearance of the 
expression from contemporary speech—has never really been resolved. One of its 
many causes was the increasing formalisation of mathematics, which brought with 
it the realisation that the paradox of the liar could infect even mathematics itself. 
This appears most simply in the form of Russell’s paradox, appropriately in the 
heart of set theory. At first blush one might think that where sets are concerned 
any intension has an extension: this is the axiom of naive set existence. For any 
property of sets there is a set containing precisely the sets with that property, all of 
those and no others. This leads rapidly to Russell’s paradox, the paradox of the 
class of all sets that are not members of themselves. This is the Russell class. Is it a 
member of itself? Well, if it is it isn’t and if it isn’t it is. This is Russell’s paradox. 
The aperçu that leapt to mind was that the problem is something to do with the
possibility of sets being members of themselves, or to do with defining sets in terms 
of membership in themselves. Although these two might sound like two formula- 
tions of the same insight, they nevertheless lead to radically different resolutions, 
and to two traditions in set theory represented by Zermelo-Fraenkel set theory
(often just called “set theory” by its votaries, and in any case universally abbreviated
to ‘ZF’) and Quine’s NF, which is our primary concern here.

According to the first view, the source of the trouble manifested in Russell’s 
paradox is thinking of sets as things that even might be members of themselves. 
This critique gives rise to a conception of set (usually called the cumulative 
hierarchy conception) that is very easy to explain to people in a modern computer 
science culture: it is simply the idea that sets form a recursive datatype: 


The empty set is a set; any collection of sets forms a set; nothing else forms 
a set. 


This declaration carries with it a kind of induction principle, as recursive 
datatype declarations always do. If we have an assertion that is true of the empty 
set, and is true of any set x as long as it is true of all x’s members, then it is true of 
all sets. This induction principle is $\in$-induction and is a theorem scheme of ZF. It
has various consequences, of which one of the easiest to show is that no set is a 
member of itself. Clearly the empty set is not a member of itself. If no member of 
x is self-membered, then x cannot be self-membered either, otherwise x would be 
a self-membered member of x, contradicting the assumption that there aren’t any. 
How does this way of conceiving sets help with Russell’s paradox? Since no set is a 
member of itself, the collection of sets that aren’t members of themselves would 
have to be the collection of all sets, and there can’t be such a thing, since it would 
be a member of itself, and we’ve just used $\in$-induction to show that no set can be a
member of itself. 
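
For readers who like to see the datatype concretely, here is a small sketch using Python's immutable `frozenset` as a stand-in for hereditarily finite sets. This is only an analogy, not a model of ZF; the function names `rank` and `no_self_membership` are hypothetical. Both functions recurse on the members of a set, which is the computational shadow of $\in$-induction.

```python
def rank(s):
    """Rank of a hereditarily finite set: 0 for the empty set, else 1 + max member rank."""
    return 0 if not s else 1 + max(rank(member) for member in s)


def no_self_membership(s):
    """Check, by recursion on members, that neither s nor anything inside it is self-membered."""
    return s not in s and all(no_self_membership(member) for member in s)


empty = frozenset()               # the empty set is a set
one = frozenset({empty})          # any collection of sets forms a set
two = frozenset({empty, one})
```

Because a `frozenset` must be built from already existing members, there is no way to construct one that contains itself, which mirrors the argument in the text.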

If one had more space it would be natural to expand at this point on how the 
conception of sets as a recursive datatype gives rise to all (well, almost all!) the 
axioms of ZF by using $\in$-induction to show that the recursive datatype is closed
under operations corresponding to those axioms. However, here the only reason 
for discussing ZF is to explain the difference between the conception of set that 
underlies it and the conception of set that underlies NF. 

The NF conception of sets does not identify the problem behind Russell’s 
paradox as a problem about the kind of set we are going to allow to exist, and 
therefore not as one that can be solved by banishing sets that do not belong to a 
nice recursive datatype. It locates the problem instead in the way the sets are 
defined. It does this by appeal to a concept of type, very closely related to the 
concept of type in modern typed programming languages such as ML. In an ML 
program, it must be possible to assign every variable a consistent type, subject to 
various typing rules; the same idea occurs in NF. Just as in ML, where one assigns 
types to variables in the context of a whole program, in NF one gives types to 
variables in a formula, and does not give a variable a type for life. In NF the types
are natural numbers, and if the variable ‘x’ in a formula $\phi$ is given the type $n$ and
the subformula ‘$x \in y$’ appears in $\phi$, then we must give ‘y’ the type $n + 1$. If
‘$x = y$’ appears in $\phi$ then ‘x’ and ‘y’ must be given the same type. A formula is
stratified if there is an assignment of types to variables that meets these constraints;
otherwise it is unstratified. NF’s axioms are now very simply stated: (i) Extensionality;
(ii) a scheme that says that the extension of a stratified formula is a set.
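
The stratification test is easy to mechanise. The sketch below is illustrative only: it assumes a formula has already been reduced to its list of atomic subformulas, encoded as hypothetical tuples `("in", x, y)` for ‘x ∈ y’ and `("eq", x, y)` for ‘x = y’, and that bound variables have been renamed so that each name stands for one variable. It then looks for an assignment of natural-number types meeting the two constraints above.

```python
from collections import deque


def stratify(atoms):
    """Return a dict of types if the atomic formulas can be stratified, else None."""
    edges = {}
    for rel, a, b in atoms:
        if rel not in ("in", "eq"):
            raise ValueError("unknown relation: %r" % (rel,))
        step = 1 if rel == "in" else 0  # 'a in b' forces type(b) = type(a) + 1
        edges.setdefault(a, []).append((b, step))
        edges.setdefault(b, []).append((a, -step))
    types = {}
    for start in edges:
        if start in types:
            continue
        types[start] = 0
        component = [start]
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w, step in edges[v]:
                if w not in types:
                    types[w] = types[v] + step
                    component.append(w)
                    queue.append(w)
                elif types[w] != types[v] + step:
                    return None  # two distinct types demanded for one variable
        # Shift each connected component so its lowest type is 0.
        low = min(types[v] for v in component)
        for v in component:
            types[v] -= low
    return types
```

For instance, `stratify([("in", "x", "y"), ("in", "y", "z")])` assigns types 0, 1, 2, while `stratify([("in", "x", "x")])` returns `None`.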

Let’s try this on $\neg(x \in x)$. Clearly we will end up trying to give ‘x’ two distinct
types and concluding that the formula is unstratified. Therefore there is no axiom of
NF saying that the collection of all sets that are not members of themselves is a
set, and so, prima facie, no paradox. The other paradoxes are all held at bay in the
same way. I am careful not to say that they are avoided, for it is an open question 
whether or not NF is consistent, but they are all held at bay in the sense that the 
obvious derivation for each paradox relies on a set-existence axiom that is not 
available in NF because the relevant formula is not stratified. 

So far so good: stratification seems to prevent the usual paradoxes from being 
derivable, but are there any deep reasons why one would expect it to have this 
effect, or is it just a happy—and perhaps merely temporary—coincidence? Natu- 
rally people have tried to find reasons why stratification ought to work in this way, 
and it turns out that stratification is not a purely syntactical notion. To explain 
why, we need a device first used by Bernays and Rieger to prove the independence 
of the axiom of foundation from ZF. A model $\mathcal{M}$ of set theory is a class with a
binary relation on it, typically written $\langle M, \in \rangle$. Now let $\pi$ be a permutation of $M$,
and associate with $M$ a new relation, which holds between $x$ and $y$ precisely if
$x \in \pi(y)$. If there is a universal set in the model $\langle M, \in \rangle$ then there is one in the
new structure too, because if $V$ was the universal set of $\langle M, \in \rangle$ then $\pi^{-1}(V)$ will
be the universal set under the new dispensation. The assertion that there is a 
universal set is stratified, and it turns out that not only is the assertion that there is 
a universal set preserved by such redefinitions of the membership relation by 
permutations, but also every stratified assertion is thus preserved. (Subject to some 
small print the converse is true too: every sentence thus preserved is equivalent to 
a stratified formula.) Although this equivalence tells us that the apparently purely 
syntactical concept of stratification does have some semantical significance, it 
doesn’t seem to tell us that this significance has anything to do with the avoidance 
of paradox. The clearest manifestation of this gap in our understanding is that our 
insight about the meaning of stratification has not yet given rise to a consistency 
proof for NF. 

The feeling among modern NFistes is that this fact about stratified formulae
(which I like to think of as a completeness theorem since it identifies a semantical
and a syntactic property) is nevertheless something that should be taken seriously.
The argument runs like this: I said just now that a model of set theory is a set
($M$, say) with a binary relation ($R$, say) on it. For present purposes we want to
think of a model of set theory as a set $M$ of atoms (things with no internal
structure) associated with an injective map $i : M \to \mathcal{P}(M)$, from $M$ into the power
set of $M$, so that the original $R$ associated with $M$ can be recovered as the relation
$a \in i(b)$ (where ‘$\in$’ is the membership relation of the real world in which we who
are contemplating the model reside). We can think of $i$ as a coding function: each
$a \in A$ “codes” a subset of $A$, namely $\{x \in A : x \in i(a)\}$. We know from Cantor’s
theorem (every set is smaller than its power set) that not every subset of A can be 
coded by a member of A, so in constructing a model of set theory we have to leave 
some sets of atoms uncoded by atoms. A decision on what injection i to associate 
with A is (among other things) a decision about which collections of atoms are to 
be sets. Now revisit the idea of the “permutation models” of the preceding 
paragraph. If $\pi$ is again a permutation of $A$ then we can define $a$ to be a member
of $b$ not if (as at the start of this paragraph) $a \in i(b)$ but instead if $a \in i(\pi(b))$,
and we obtain another model of set theory. What is the difference between these
two models? Well (since $i$ and $i \circ \pi$ have the same range) they have made the same
decision about which classes of atoms are to be sets, but different views on how 
that decision is to be implemented: the same collections of atoms are to be sets of 
the model, it is just that they are not necessarily going to be coded by the same 
elements of A as before. Accordingly the general feeling among NFistes is that
stratification is the syntactical arm of a gang of concepts to do with what computer 
scientists call implementation-invariance. 

But this is all very unhistorical. Let us go back to the years following 1937. NF 
was born in interesting times, and the West had other things on its mind during 
NF’s youth. The first really interesting development did not take place until 1953, 
when E.P. Specker in Zurich showed that NF refuted the axiom of choice and 
thereby proved the axiom of infinity [7]. This result was a most mysterious and 
disquieting one, best approached in the context of another result of Specker’s, nine 
years later, that is in many ways more illuminating. 

Specker’s 1962 paper [9] connects NF with Russellian type theory in a way that 
neatly turns back the clock about 50 years. The syntax of Russell’s type theory is 
very nasty, but the elements needed to tell its story can be recounted relatively 
easily. In Russell’s type theory, as simplified by Ramsey, every set belongs to a type. 
The bottom type is a type of atoms, and thereafter type n + 1 consists of sets of 
things of type n. Every variable of the theory is constrained to range over one level 
only. Accordingly no allegation that the collection of all sets that aren’t members of 
themselves is a set can even be formulated in this sort of theory, let alone proved. 
That fact was the attraction; there are of course drawbacks as well. One is that we 
thereby chuck out the baby with the bathwater, in the sense that as well as 
rendering unsayable things like the existence of the Russell class we also make 
certain apparently entirely innocent things unsayable as well. A specific conse- 
quence is that the Russell-Ramsey theory makes all sorts of assertions that look 
very similar but are actually distinct, even though in some sense one feels that they 
ought not to be. For example (according to Russellian type theory) there is no 
single empty set but an empty set at each type. The language does not enable us to
say anything like $(\exists x)(\forall y)(y \notin x)$. But it can say $(\exists x_1)(\forall y_0)(y_0 \notin x_1)$,
$(\exists x_2)(\forall y_1)(y_1 \notin x_2)$, $(\exists x_3)(\forall y_2)(y_2 \notin x_3)$, ... and so on, where the subscripts are
type subscripts. The language clearly has an endomorphism executed as follows:
take a formula, increase all the type subscripts in it by 1. The result is a new
formula, written ‘$\phi^+$’ if the first formula was ‘$\phi$’. What is the relation between
$\phi$ and $\phi^+$? In [8] Specker drew a parallel with projective geometry, which also has an
automorphism like this. By interchanging ‘point’ and ‘line’, and interchanging ‘lie
on’ with ‘meet at’ one can transform an assertion $\phi$ of projective geometry into
another assertion of projective geometry, which is standardly called the dual of the
first, and is written $\hat{\phi}$. It is standard that the dual of an axiom of projective
geometry is another axiom. By induction on proofs one shows that the dual of a
theorem is a theorem. But is $\phi \leftrightarrow \hat{\phi}$ a theorem? It is not obvious one way or the
other. In the case of projective geometry the story has a neat solution and a happy
ending (the scheme $\phi \leftrightarrow \hat{\phi}$ is equivalent to Desargues’ theorem), but in the type
theory case it is more interesting, and not just because now the ‘+’ operation is not
an involution. It is certainly the case that $\phi^+$ is an axiom whenever $\phi$ is, and $\phi^+$
is a theorem whenever $\phi$ is, but is $\phi \leftrightarrow \phi^+$ always a theorem? The example of the
infinitely many statements saying that there is an empty set at each type is one that
suggests very strongly that $\phi \leftrightarrow \phi^+$ ought to be a theorem!

It turns out that the scheme $\phi \leftrightarrow \phi^+$ is not a theorem of Russellian type theory
but that it is consistent with Russellian type theory if and only if NF is consistent: 
this is Specker’s 1962 theorem. This is very fitting when one reminds oneself of 
Quine’s thinking behind the set existence axiom of NF. Quine’s view—expressed in 
this MONTHLY 60 years ago—was that the type discipline that banished the 
paradoxes from type theory did so by making it impossible to formulate certain set
existence axioms (like that giving the Russell class), and that making multiple 
copies—one at each type—of apparently perfectly nonproblematic sets like the 
empty set is an unwanted side effect and not part of the solution. If we can avoid 
some of this duplication by means of judicious polymorphism then this is all to the 
good. The result was that Quine kept the type distinctions but instead of enforcing 
them at the level of syntax (so that ‘$x \in x$’ would be illformed, as in Russellian type
theory) enforced them merely at the stage of axioms of set existence, so that
‘$x \in x$’ is wellformed, but its extension is not a set. A modern way to describe this
development is to say that Quine obtained NF from Russellian type theory by 
relaxing its syntactic constraints by a bit of polymorphism, and that Specker’s 1962 
theorem makes this fact formal and explicit. 

One consequence of Specker’s discovery was the involvement of proof theory in 
NF studies. Any proof in NF of a stratified formula corresponds to a proof of a 
version of that formula (with type subscripts glued on) in Russellian type theory 
with a scheme of polymorphism: “from $\vdash \phi$ deduce $\vdash \phi^+$ and vice versa”. This
interchangeability relates the proof theory of NF to the proof theory of type 
theory and thereby places NF studies firmly in the mainstream of modern 
theoretical computer science. Once NF has been placed in such a context, it is 
natural to think about what happens to the ideas that gave rise to its birth if they 
are approached constructively. It is then natural in turn to see if the strange 
derivation of the axiom of infinity works from a constructive standpoint. It turns 
out that there is a sensible constructive version of NF in which we can prove that 
it is not the case that every set is finite, but (since constructively $\neg \forall x\, p$ is not the
same as $\exists x\, \neg p$) we cannot, apparently, prove that there is an infinite set. When
working with classical logic we are of course not hampered in this way, and if we
can show that not every set is finite then $V$, the universe, is certainly infinite. Now
according to NF $V$ is a set (it is the extension of the expression ‘$x = x$’ which is
certainly stratified) and so too is its quotient under the equivalence relation “is the 
same size as”. This quotient will also be infinite, and it will give us an implementa- 
tion of the natural numbers. The contrast between the classical case and the 
constructive case, where although we can prove that not every set is finite, there 
doesn’t appear to be any one set whose infinitude can be proved (and so we 
apparently cannot obtain an implementation of the natural numbers), suggests that 
it may be possible to prove the consistency of constructive NF by much simpler 
methods than will be needed to prove the consistency of NF itself. 

There are other subsystems of NF for which we can in fact do more than 
merely piously hope for consistency proofs. Most of these achieve their consistency 
by restricting the number of comprehension axioms in one way or another. For 
example NF$_2$ has axioms to say that the universe is a boolean algebra under $\subseteq$
and that $\{x\}$ is always a set; NFO has in addition an axiom saying that $\{y : x \in y\}$ is
a set. (The operation sending $x$ to $\{y : x \in y\}$ enables us to show by induction on $\phi$
that $\{x : \phi(x, y_1, \ldots, y_n)\}$ is a set as long as $\phi$ is stratified and quantifier-free, and it
is actually an $\in$-isomorphism!) NF$_3$ allows $\{x : \phi\}$ as long as the corresponding set
existence axiom can be stratified with no more than 3 types. There is also a pair of
theories arising from a third version of the circularity critique: perhaps it is 
necessary not only to create sets in order (as we do in the cumulative hierarchy 
conception) so that each set consists only of sets created earlier, but also to restrict 
the ways in which we specify sets so that we can form $\{x : \phi\}$ only if $\phi$ not only does
not hold of things created later, but does not even quantify over sets created later.
The idea is that we should be allowed to form $\{x : \phi\}$ only if checking that $x$ has the
property $\phi$ does not involve examining sets we have not yet created. Set existence
axioms obeying such a constraint are said to be predicative and it has been known 
for a long time that adding predicativity constraints makes consistency much easier 
to prove. 

But the most interesting subsystem of NF doesn’t arise in this way and was 
totally unexpected. This was NFU, uncovered by R.B. Jensen in 1969. If one 
weakens the extensionality axiom that is so central to set theory to allow for 
distinct empty sets (‘U’ for “Urelemente”, which is what set theorists call empty 
sets: they are certainly very hard to tell apart!) but retains it for nonempty sets one 
obtains the system NFU. The corresponding manoeuvre in ZF results in a system
that is equiconsistent with ZF and was—before the development of forcing by 
Cohen in the 60’s—used for independence proofs for the axiom of choice and the 
like. When we weaken NF to allow urelemente the effect is dramatically different: 
NFU is provably consistent and is very weak indeed, too weak to prove the axiom 
of infinity. 

One could view the consistency of NFU merely as a vindication of Quine’s 
insight that the type disciplines are enough by themselves to banish the paradoxes, 
even if we flirt with danger by playing with a bit of polymorphism, as does Holmes 
[3]. Although it certainly is such a vindication, it raises bigger questions than it 
answers. After all, if type disciplines are enough to put paradox to flight even when 
relaxed with polymorphism, why is there this dramatic difference in strength 
between NF with and without atoms? Clearly there is something else going on. 
(There is even the ghastly and largely unspoken possibility that the consistency of 
NFU might have nothing to do with stratification at all, but is purely the result of 
weakening extensionality (and thereby betraying set theory) and that even though 
NFU is consistent, NF itself isn’t.) 

But even if we do not yet understand clearly why NFU is so much weaker than 
NF, we can at least start to put this new system to use [4]. There is for the moment 
a great interest in alternatives to ZF, driven by the feeling that certain structures 
with non-wellfounded relations on them ought to be represented by sets. 
(A relation $R$ on a set $X$ is wellfounded if and only if for every nonempty subset
$X' \subseteq X$, $(\exists y \in X')(\forall x \in X')\,\neg R(x, y)$.) For a long time the standard implementation
of ordinal numbers in ZF has been one that arranges for the (wellfounded)
relation $<$ between ordinal numbers to be implemented by $\in$, and the idea is
abroad that all binary relations between mathematical objects of interest should
be thus representable by $\in$ between the sets chosen to implement those mathematical
objects. Under the recursive datatype conception of sets (as in ZF) we can
prove easily that $\in$ is a wellfounded relation on the universe of all sets.
Consequently there is no possibility of representing the kind of illfounded relations 
that appear in computer science as relations between sets of ZF. 
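
For a finite carrier set the definition of wellfoundedness can be checked directly by brute force. The sketch below is illustrative only; the name `is_wellfounded` and the encoding of a relation as a set of ordered pairs are assumptions made for the example, and the running time is exponential in the size of the carrier set.

```python
from itertools import combinations


def is_wellfounded(X, R):
    """True if every nonempty subset of X has an element y with no x in the subset related to it."""
    elems = list(X)
    for size in range(1, len(elems) + 1):
        for subset in combinations(elems, size):
            has_minimal = any(
                all((x, y) not in R for x in subset) for y in subset
            )
            if not has_minimal:
                return False
    return True
```

The usual order on {0, 1, 2}, given as the pairs (0, 1), (0, 2), (1, 2), passes the test, whereas a two-element loop {(0, 1), (1, 0)} or a self-related point {(0, 0)} fails it. The latter two are the kind of illfounded pattern that cannot be represented by membership between sets of ZF.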

What is a suitable framework for this? A fashionable candidate about which a 
lot has been written recently is ZF with “antifoundation” axioms, of which a racy 
and entertaining treatment can be found in the recently published book [1]. 
Antifoundation axioms ensure that all binary relations between mathematical 
objects of interest are representable by $\in$ between the sets chosen to implement
those mathematical objects. In a way this is a very unidiomatic thing to do to ZF.
As we noted earlier, the recursive datatype conception of sets entails that $\in$ is a
wellfounded relation. It is surely perverse to develop an axiomatic set theory on
the basis of one conception of set, and then throw away that conception by
adopting axioms that are incompatible with it, thereby rendering suspect all the
axioms it gave rise to. If we are to postulate sets that are forbidden by the 
recursive-datatype conception, then there is no point in looking to axioms arising 
from that conception to tell us how those sets are going to behave. Surely it makes 
more sense to have axioms of set existence that never owed anything to that 
conception in the first place. Such a set of axioms is to be found in NFU. 

Can NFU in addition provide a set theoretic framework containing $\in$-copies
of all the structures we can describe, as postulated by the antifoundation axioms? 
It turns out that for various technical reasons antifoundation axioms are not 
consistent with NFU as they stand. They need to be restricted to hereditarily 
small sets. (A set is hereditarily small if and only if it is a small set of hereditarily 
small sets.) What is a small set? Fortunately there is an embarras de richesse of 
direct concepts of smallness: we could say that x is small if and only if x is 
wellordered, or if x is the same size as a wellfounded set, or x cannot be mapped 
onto the universal set, or is smaller than its power set. These last two seem a bit 
odd, but are actually quite natural in the context of NFU. According to NFU the 
universe is a set. Therefore Cantor’s theorem, which says that every set is smaller 
than its power set, must fail. But it succeeds for some sets, and these typically tend 
to be smaller than those for which it fails. A slightly smoother notion is strongly 
cantorian. A set x is strongly cantorian if and only if the restriction of the singleton 
function to x is a set. Theorems of Jensen [5] and Holmes [3] tell us that the 
hereditarily strongly cantorian sets can be almost any ZF-style model we want. 
A place to look for substructures of models of NFU in which every set is small and 
antifoundation axioms are true would perhaps be the greatest fixed point for the 
operation $x \mapsto$ the set of small subsets of $x$. The least fixed point consists entirely
of wellfounded sets and satisfies foundation rather than antifoundation. 

There is no space in a brief retrospective like this to give adequate pointers to 
all the relevant literature, and I am uncomfortably aware that the work of my 
Doktorvater Maurice Boffa, the unofficial head of the Belgian school of NFistes, is
underrepresented in this survey, as is his collaboration with Marcel Crabbé and his 
role in furthering NF studies by supervising André Pétry and Roland Hinnion. 
Nobody likes to appear to be promoting his own work unduly, but sadly it really is 
true that the only book-length treatment of NF is [2]. This book also contains 
treatments of permutation models and all the subsystems of NF mentioned in this 
article. Fortunately for readers who have access to the web there is also Randall 
Holmes’ NF website at http://math.idbsu.edu/faculty/holmes.html,
which contains an exhaustive bibliography, links to other workers on NF, and 
Holmes’ introduction to NFU. 
Remarks on Sharkovsky’s Theorem 


Michał Misiurewicz 


Recent publication of a paper on Sharkovsky’s Theorem in this MONTHLY [8] is a 
good occasion for making several historical comments on this beautiful theorem. 
The original paper of Sharkovsky [12] was published in Russian and has been 
translated into English only recently [13]. As a result, some authors citing [12] may 
be not fully aware of the contents of this paper. Moreover, there was a subsequent 
paper by Sharkovsky [14], that in some sense completed his theorem. 
Consider the Sharkovsky ordering of the set of natural numbers: 


$$3 \prec 5 \prec 7 \prec 9 \prec \cdots \prec 3\cdot 2 \prec 5\cdot 2 \prec 7\cdot 2 \prec 9\cdot 2 \prec \cdots$$
$$\prec 3\cdot 2^2 \prec 5\cdot 2^2 \prec 7\cdot 2^2 \prec 9\cdot 2^2 \prec \cdots \prec 2^3 \prec 2^2 \prec 2 \prec 1.$$

Let I be either the real line or an interval. If $f: I \to I$ is a continuous map, then a 
set $P = \{x_1, x_2, \ldots, x_n\}$ such that $f(x_1) = x_2$, $f(x_2) = x_3$, ..., $f(x_n) = x_1$ is called 
a cycle or a periodic orbit. The period of a cycle P is the number of its elements. 
The three parts of the full Sharkovsky Theorem are: 


Theorem 1. Let $f: I \to I$ be a continuous map. If f has a cycle of period n and if n 
appears before k in the Sharkovsky ordering, then f has a cycle of period k. 


Theorem 2. For every k there exists a continuous map $f: I \to I$ that has a cycle of 
period k, but has no cycles of period n for any n appearing before k in the Sharkovsky 
ordering. 


Theorem 3. There exists a continuous map $f: I \to I$ that has a cycle of period $2^n$ for 
every n and has no cycles of any other periods. 


In most papers and books dealing with Sharkovsky’s Theorem, this name is 
applied only to Theorem 1. However, the original statement of Sharkovsky’s 
Theorem is stronger. It is equivalent to Theorem 1 plus the assertion that if n 
appears before k in the Sharkovsky ordering then there exists a continuous map 
$f: I \to I$ with a cycle of period k but with no cycle of period n. Moreover, the 
arguments given in [12] also prove Theorem 2. Theorem 3 is proved in [14]. Thus, 
“Sharkovsky’s Theorem” properly refers to the union of all three theorems. 
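
The ordering itself is easy to compute. The sketch below is illustrative only; the names `sharkovsky_key` and `precedes` are introduced here and do not come from the article. It writes n as $2^e m$ with m odd: numbers with m > 1 are ordered first by e and then by m, and the pure powers of 2 come last, in decreasing order.

```python
def sharkovsky_key(n):
    """Sort key placing natural numbers in the Sharkovsky ordering."""
    if n < 1:
        raise ValueError("n must be a natural number")
    e = 0
    m = n
    while m % 2 == 0:
        m //= 2
        e += 1
    if m > 1:
        # odd part > 1: ordered by power of two, then by odd part
        return (0, e, m)
    # pure powers of two come last, largest first
    return (1, -e, 0)


def precedes(n, k):
    """True if n appears strictly before k in the Sharkovsky ordering."""
    return sharkovsky_key(n) < sharkovsky_key(k)


print(sorted(range(1, 13), key=sharkovsky_key))
# [3, 5, 7, 9, 11, 6, 10, 12, 8, 4, 2, 1]
```

With this helper, Theorem 1 reads: if f has a cycle of period n and `precedes(n, k)` holds, then f has a cycle of period k.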

The first proofs of Theorem 1 were difficult to follow. I remember that when I 
learned of this theorem, I tried to read the proof in [12]. The idea was clear, but 
the details were messy. This was apparently also an impression of Stefan, who 
wrote another proof [15]. However, when I tried to read Stefan’s proof, I also 
found that the idea was clear, but the details were messy. Therefore I decided to 
write my own proof. When I tried to read it several months later, I realized that I 
did no better: the idea was clear, but the details were messy. The standard proof is 
now easy to follow in complete detail; it was discovered almost simultaneously by 
many mathematicians (see e.g. [3], [4], [10], [16]). 

The standard proof of Theorem 2 uses examples of maps having only cycles of 
odd period n and periods following n in the Sharkovsky ordering, and the “square 
root” construction. Various presentations of this proof (including [8]) are only small 
modifications of the proof in [12]. Many of them leave details to the reader (as for 
instance in [7, pp. 66-68]). 
There was an interesting question connected with these proofs. Suppose f has a 
cycle P of period n and no cycle of any period preceding n in the Sharkovsky 
ordering. What does P look like? This question has been answered in [2], [6], and 
[9]. Problems of this type led to the development of combinatorial dynamics. 

To prove Theorem 3, one has to give an example of what is called a map of type 
$2^\infty$. Several kinds of examples are known, but the most important are the ones that 
are smooth and unimodal (“unimodal” means “with one interior local extremum”). 
For these maps, as well as for one-parameter families containing them, one 
observes interesting geometric structure both in the parameter space and on the 
interval. This observation led to the development of the so-called Feigenbaum 
Theory (see [5, pp. 199-238]). A very short proof of Theorems 2 and 3 together can 
be given by looking at the family of truncated tent maps (trapezoidal maps); see 
[1]. However, this proof is not constructive. The real career of Sharkovsky’s 
Theorem began with the publication of the paper Period three implies chaos by Li 
and Yorke in this MONTHLY [11], although the authors did not even know about 
Sharkovsky’s Theorem when they wrote their paper. 
Correction to: Zaphod Beeblebrox’s Brain 
and the Fifty-Ninth Row of Pascal’s Triangle 


Andrew Granville 


In my paper [1], we studied Pascal’s Triangle modulo 2, 4, 8 and 16; and, in 
particular, its self-similar structure. It is well-known that the number of entries 
$\equiv 1 \pmod 2$ in the nth row of Pascal’s Triangle is $2^{\#_2(n)}$, where $\#_2(n)$ is the number 
of ‘1’s in the binary expansion of n. The proof, developed in our article, observed 
that if $T_k$ denotes the top $2^k$ rows of Pascal’s Triangle (mod 2), then 


$$T_{k+1} = \begin{matrix} T_k \\ T_k \;\; T_k \end{matrix}$$


and proceeded by induction. 


We discovered by experiment that a similar rule works with Pascal’s Triangle 
(mod 4): if there are not two consecutive ‘1’s in the binary expansion of n then 
there are $2^{\#_2(n)}$ entries $\equiv 1 \pmod 4$, and no entries $\equiv -1 \pmod 4$, in the nth 
row of Pascal’s Triangle. On the other hand if there are two consecutive ‘1’s in the 
binary expansion of n then there are $2^{\#_2(n)-1}$ entries $\equiv 1 \pmod 4$, and $2^{\#_2(n)-1}$ 
entries $\equiv -1 \pmod 4$, in the nth row of Pascal’s Triangle. We proved this by 
noting that if 


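[Display lost in extraction: the block decomposition of the top rows of Pascal’s Triangle (mod 4).] 


where $V_k^{T}$ is the transpose of $V_k$, and again proceeding by induction. 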

The main point of [1] was that there is a version of this “self-similarity” modulo 
4 (with an easily understood change of states) modulo any prime power. This also 
allowed us to prove that the number of entries $\equiv a \pmod{2^b}$ with a odd, $b \le 2$, in 
any row of Pascal’s Triangle, is either 0 or a power of 2. This is also true for b = 3, 
though the proof that we gave in [1] was faulty. Surprisingly though, b = 4 is an 
exceptional case, since exactly six entries in Row 59 of Pascal’s Triangle are $\equiv 1$ 
(mod 16). 
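
The counting rules modulo 2 and modulo 4 are easy to test numerically. The sketch below is illustrative only; the helper names are introduced here, and it needs Python 3.8 or later for `math.comb`. It tabulates the odd entries of a row by residue class.

```python
from collections import Counter
from math import comb


def odd_residue_counts(n, modulus):
    """Count the odd entries of row n of Pascal's Triangle by residue class."""
    row = (comb(n, k) for k in range(n + 1))
    return Counter(entry % modulus for entry in row if entry % 2 == 1)


def ones(n):
    """Number of 1s in the binary expansion of n."""
    return bin(n).count("1")


def has_consecutive_ones(n):
    return "11" in bin(n)


for n in range(64):
    counts = odd_residue_counts(n, 4)
    # mod 2 rule: the number of odd entries is 2 ** ones(n)
    assert sum(counts.values()) == 2 ** ones(n)
    # mod 4 rule, split on whether the binary expansion contains "11"
    if has_consecutive_ones(n):
        assert counts[1] == counts[3] == 2 ** (ones(n) - 1)
    else:
        assert counts[1] == 2 ** ones(n) and counts[3] == 0

# Residues of the odd entries of Row 59 modulo 16, the exceptional case.
print(sorted(odd_residue_counts(59, 16).items()))
```

The last line lets the reader inspect Row 59 directly; the text above reports that the class 1 (mod 16) occurs six times there, which is not a power of 2.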
In [1], we proved that if $D_k$ denotes the top $2^k$ rows of Pascal’s Triangle 
(mod 8), with 


and if 


then 
We then attempted (see (3) in [1]) to write down all the cases for how the binary 
expansion of n determines the number of entries in row n that are $\equiv a \pmod 8$, 
for a = 1, 3, 5 and 7; and proceeded by induction on the binary digits of n, based 
on the evolution of $D_k$ into $E_k$. Although this method is certainly valid, Fred 
Howard and Ken Davis, and Jim Huard, Blair Spearman, and Ken Williams [2] all 
observed that we failed to enumerate the cases correctly; thanks to all of them for 
such a careful reading of my paper. Here is a correct enumeration of those cases, 
which follows by induction. 

Let $(n)_2$ denote the binary expansion of n. If $(n)_2$ begins with “11”, then we 
may need to “cut” the row up into four quadrants; we can discuss the first two 
quadrants only because of the horizontal symmetry of Pascal’s Triangle. All of the 
statements preceding Figure 12 in [1] remain unchanged: 


• If $(n)_2$ contains no 11 and no 101 then all odd entries are $\equiv 1 \pmod 8$. 

• If $(n)_2$ contains no 11 but does contain a 101 then there are an equal number 
of entries $\equiv 1 \pmod 8$ and $\equiv 5 \pmod 8$ and no other odd entries. 

• If $(n)_2$ contains a 1111, or it contains both a 11 and a 101, then there are an 
equal number of entries $\equiv 1, 3, 5$ and $7 \pmod 8$, and similarly in each 
quadrant (when relevant). 


If n does not belong to any of these cases then, in binary, it has the form 

$$(n)_2 = 0\,\underbrace{1\cdots 1}_{t_1\ 1\text{'s}}\,\underbrace{0\cdots 0}_{u_1\ 0\text{'s}}\,\underbrace{1\cdots 1}_{t_2\ 1\text{'s}}\cdots\underbrace{0\cdots 0}_{u_{m-1}\ 0\text{'s}}\,\underbrace{1\cdots 1}_{t_m\ 1\text{'s}}\,\underbrace{0\cdots 0}_{u_m\ 0\text{'s}}.$$


Here $1 \le t_j \le 3$ for each j, and $u_i \ge 2$ for $1 \le i \le m - 1$. It is at this point that 
we part company from [1], where we failed to distinguish certain cases that do 
arise. Note that we have started $(n)_2$ with a single ‘0’, followed by the usual ‘1’. For 
example, $(825)_2 = 01100111001$. 

Henceforth, we may assume that $(n)_2$ contains no 1111 nor a 101, but does 
contain a 11. 


• If $(n)_2$ contains no 111, and n is even or $n \equiv 1 \pmod 4$, then there are an 
equal number of entries $\equiv 1 \pmod 8$ and $\equiv 7 \pmod 8$ and no other odd 
entries. 

• If $(n)_2$ contains no 111 nor a 0110, and $n \equiv 3 \pmod 8$, then there are an 
equal number of entries $\equiv 1 \pmod 8$ and $\equiv 3 \pmod 8$ and no other odd 
entries. 

• If $(n)_2$ contains a 111 and no 0110, and $n \not\equiv 7 \pmod 8$, then there are an 
equal number of entries $\equiv 1 \pmod 8$ and $\equiv 3 \pmod 8$ and no other odd 
entries. 

• Otherwise there are an equal number of entries $\equiv 1, 3, 5$ and $7 \pmod 8$. 


It is possible to explain this induction proof succinctly: Let $S_n \subseteq \{1, 3, 5, 7\}$ be 
the set of residue classes a (mod 8) such that there exists an integer j for which 
$\binom{n}{j} \equiv a \pmod 8$. First note that $S_{2^j} = \{1\}$ for all $j \ge 1$, $S_3 = \{1, 3\}$, and 
$S_{2^j - 1} = \{1, 3, 5, 7\}$ for $j \ge 3$. By studying the evolution of $E_k$ from $D_k$ we see that 


whenever $(n)_2$ contains a 101 then $5S_n = S_n$, 
whenever $(n)_2$ contains a 0110 then $7S_n = S_n$, 
whenever $(n)_2$ contains a 01110 then $3S_n = S_n$, and 
whenever $(n)_2$ contains a 1111 then $S_n = \{1, 3, 5, 7\}$. 

These observations account for the composition of $S_n$. If $a, b \in S_n$, then the 
number of entries in the nth row that are $\equiv a \pmod 8$ is equal to the number of 
entries that are $\equiv b \pmod 8$. 

Combining these observations by studying the binary expansion of n finishes the 
proof. 
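
The sets $S_n$ can be computed directly for small n, which gives a quick sanity check of the closure statements. This is an illustrative sketch only; the function names are introduced here and are not from [1].

```python
from math import comb


def residue_set(n, modulus=8):
    """Odd residue classes (mod 8) attained by the entries of row n."""
    return {comb(n, j) % modulus for j in range(n + 1)} & {1, 3, 5, 7}


def closed_under(multiplier, residues, modulus=8):
    """True if multiplying every residue by `multiplier` gives the same set."""
    return {(multiplier * a) % modulus for a in residues} == residues


def binary_with_leading_zero(n):
    """Binary expansion of n written with a single leading 0, as in the text."""
    return "0" + bin(n)[2:]


print(binary_with_leading_zero(825))                         # 01100111001
print(residue_set(5), closed_under(5, residue_set(5)))       # n = 101 in binary
print(residue_set(6), closed_under(7, residue_set(6)))       # n = 110 in binary
print(residue_set(14), closed_under(3, residue_set(14)))     # n = 1110 in binary
print(residue_set(7))                                        # n = 2^3 - 1
```

Row 14, for instance, has binary expansion 01110 and its odd entries fall only in the classes 1 and 3, in agreement with the third closure rule.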
NOTES 


Edited by Jimmie D. Lawson and William Adkins 


A Simple Formula for π 


Victor Adamchik and Stan Wagon 


Dedicated to the memory of Tom Tymoczko (1943-1996), an innovative investigator 
into the nature of computer proofs 


1. THE RADICAL BBP IDEA. In 1995 David Bailey, Peter Borwein, and Simon 
Plouffe [2] discovered the following shocking formula for π: 


$$\pi = \sum_{k=0}^{\infty} \frac{1}{16^k}\left(\frac{4}{8k+1} - \frac{2}{8k+4} - \frac{1}{8k+5} - \frac{1}{8k+6}\right).$$


This result is shocking because it can be used to generate the nth base-16 digit of 
π without having to look at any prior digits. And, so long as n is less than a billion 
or so, the entire computation can be carried out with 16-digit numbers. This is a 
radical idea, since all previous algorithms for the nth digit of π required the 
computation of all previous digits, and the use of d-digit arithmetic in the 
computation. For more details of the fairly easy argument that leads from the BBP 
formula to an algorithm for far-out hex digits of π see [1] or [2]. 
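
To give a feel for the digit-extraction idea, here is a minimal sketch of the commonly described scheme; it is not the code of [1] or [2], and the function name `bbp_hex_digit` is introduced here. Multiplying the series by $16^n$ and keeping only fractional parts, the terms with $k \le n$ are reduced with modular exponentiation, and the rapidly decaying tail is summed directly. Ordinary floating point limits this sketch to modest n.

```python
def bbp_hex_digit(n):
    """Hex digit of pi at position n after the point (n = 0 gives the first digit)."""

    def fractional_series(j):
        # fractional part of sum over k of 16^(n-k) / (8k + j)
        total = 0.0
        for k in range(n + 1):
            denom = 8 * k + j
            # only 16^(n-k) mod denom matters for the fractional part
            total = (total + pow(16, n - k, denom) / denom) % 1.0
        k = n + 1
        while True:
            term = 16.0 ** (n - k) / (8 * k + j)
            if term < 1e-17:
                break
            total += term
            k += 1
        return total

    x = (4 * fractional_series(1) - 2 * fractional_series(4)
         - fractional_series(5) - fractional_series(6)) % 1.0
    return int(16 * x)


print("".join("%X" % bbp_hex_digit(n) for n in range(10)))  # 243F6A8885
```

The printed string matches the start of the hexadecimal expansion π = 3.243F6A8885..., and no call uses the digits produced by any other call.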

Proving the BBP formula is not difficult. But that misses the main point: How 
did they find it? In short, they had a hunch that such a formula might exist and 
they searched for it using high-precision approximate reals, a high-performance 
SGI workstation, and the PSLQ algorithm [3], [4]. In this note we show how a 
simpler formula of this type can be discovered in such a way that a proof 
accompanies the discovery. We will present only a single result. Several more 
formulas of this type can be found in [1]. 

Before leaving the BBP formula, here is a proof that it is correct using 
Mathematica to perform the summation. 


FullSimplify[TrigToExp[FullSimplify[
    Sum[(1/16^k) (4/(8k + 1) - 2/(8k + 4) - 1/(8k + 5) - 1/(8k + 6)),
        {k, 0, Infinity}]]] /.
  a_. Log[b_] + a_. Log[c_] :> a Log[b c]]


π 


This “proof” is of very little value, for it gives us no insight whatsoever. Some 
might even say that it is not truly a proof! But in principle, such a computation can 
be viewed as a proof. There are some subtleties. Some types of computations come 
along with certificates that allow verification; for example, if a computer churns 
out an indefinite integral, the result can be differentiated to see if it agrees with 
the integrand. Mathematica does not provide such certificates for sums, but recent 
work of Wilf and Zeilberger has shown that certain sums, such as the ones that 
occur here, do carry certificates, and implementing the production and verification 
of such certificates has in fact been done (see [5]). So it is true that, provided one 
uses the latest work on symbolic summation algorithms, a computation such as the 
above can be taken to be a proof. 

But such a proof is not very helpful. The real power of sophisticated symbolic 
software is that this first computation provides the starting point for an investiga- 
tion that yields both deeper understanding of the formula and, with luck, some 
new formulas. That sort of investigation is what we carry out here. It turns out that 
the sums that arise in this note can be transformed to integrals, and then 
antiderivatives can be checked so that a proper standard of proof is maintained. 
We show how to do that at the end of Section 2, but we start our work with the 
reasonable working assumption that the results of Sum are correct. 


2. DISCOVERY AND PROOF. Suppose we wish to see if π can be expressed in 
the following form (we examined several such forms and are presenting here the 
simplest one that worked). 

$$\pi = \sum_{k=0}^{\infty} \frac{(-1)^k}{4^k}\left(\frac{a_1}{4k+1} + \frac{a_2}{4k+2} + \frac{a_3}{4k+3} + \frac{a_4}{4k+4}\right).$$

We just feed the general sum to Mathematica (we used version 3.0.0; other versions 
may yield slightly different forms). 


Simplify[FunctionExpand[
    Sum[((-1)^k/4^k) (a₁/(4k + 1) + a₂/(4k + 2) + a₃/(4k + 3) + a₄/(4k + 4)),
        {k, 0, Infinity}]]]


1/8 (8 (a₂ ArcCot[2] - a₄ Log[4] + a₄ Log[5] +
      a₃ (π/4 + ArcCot[3] - Log[25]/4)) +
   a₁ (π + 4 ArcCot[3] + Log[25])) 


Now we make some simplifications, the last one based on the identity arctan 1 + 
arctan 2 + arctan 3 = π. 


Expand[% /. {Log[25] → 2 Log[5], Log[4] → 2 Log[2],
      ArcCot[x_] :> (π/2 - ArcTan[x])} /.
    ArcTan[3] → (3π/4 - ArcTan[2])]


a₂π/2 + (1/2) a₁ ArcTan[2] - a₂ ArcTan[2] + a₃ ArcTan[2] -
  2 a₄ Log[2] + (1/4) a₁ Log[5] - (1/2) a₃ Log[5] + a₄ Log[5]


Collect[%, {π, ArcTan[2], Log[5], Log[2]}]


a₂π/2 + (a₁/2 - a₂ + a₃) ArcTan[2] - 2 a₄ Log[2] + (a₁/4 - a₃/2 + a₄) Log[5]

Now we simply search for a-values that cause all but the first summand to vanish, 
and the first to equal π. This is easily done by hand, but since Mathematica is 
running: 


Solve[{a₂/2 == 1, a₁/2 - a₂ + a₃ == 0, a₄ == 0, a₁/4 - a₃/2 + a₄ == 0}]


{{a₁ → 2, a₂ → 2, a₃ → 1, a₄ → 0}} 


And so we have a new formula for π: 

$$\pi = \sum_{k=0}^{\infty} \frac{(-1)^k}{4^k}\left(\frac{2}{4k+1} + \frac{2}{4k+2} + \frac{1}{4k+3}\right).$$

We reiterate that the proof comes for free along with the discovery (though for a 
rigorous proof one might prefer to use integrals instead of sums, as we discuss in a 
moment). As with BBP, our formula can be used in a digit-extraction scheme in 
base 4. Of course, digit extraction in base 4 is fully equivalent to the base-16 case. 
This method of undetermined coefficients can also be used to generate the BBP 
formula; we leave that task to the reader who wishes to exercise a computer 
algebra system. 
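
Both series converge geometrically, so a short numerical check is cheap. The sketch below is illustrative only (the function names are introduced here); it sums the first terms exactly with rational arithmetic and compares the results with the floating-point value of π.

```python
from fractions import Fraction
import math


def base4_partial_sum(terms):
    """Partial sum of the base-4 series with coefficients (2, 2, 1, 0)."""
    total = Fraction(0)
    for k in range(terms):
        total += Fraction((-1) ** k, 4 ** k) * (
            Fraction(2, 4 * k + 1) + Fraction(2, 4 * k + 2) + Fraction(1, 4 * k + 3)
        )
    return total


def bbp_partial_sum(terms):
    """Partial sum of the BBP series."""
    total = Fraction(0)
    for k in range(terms):
        total += Fraction(1, 16 ** k) * (
            Fraction(4, 8 * k + 1) - Fraction(2, 8 * k + 4)
            - Fraction(1, 8 * k + 5) - Fraction(1, 8 * k + 6)
        )
    return total


print(abs(float(base4_partial_sum(30)) - math.pi) < 1e-14)  # True
print(abs(float(bbp_partial_sum(15)) - math.pi) < 1e-14)    # True
```

Each term of the base-4 series contributes about two bits, and each BBP term about four, which is why half as many BBP terms reach the same accuracy.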

Further explorations along these lines seem to be easier if one uses integrals 
instead of series. This also eases the task of producing a verifiable proof. Such a 
transformation, focussing on the base-4 case under discussion, is carried out as 
follows. 


1. Define $g(i) = \sum_{k=0}^{\infty} \frac{(-1)^k}{4^k}\,\frac{1}{4k+i}$. 

2. $g(i) = \sum_{k=0}^{\infty} \sqrt{2}^{\,i} \int_0^{1/\sqrt{2}} (-1)^k z^{4k+i-1}\,dz$ (easy integration). 

3. $g(i) = \sqrt{2}^{\,i} \int_0^{1/\sqrt{2}} \sum_{k=0}^{\infty} (-1)^k z^{4k+i-1}\,dz$ (interchange). 

4. $g(i) = \sqrt{2}^{\,i} \int_0^{1/\sqrt{2}} \frac{z^{i-1}}{1+z^4}\,dz$ (geometric series). 

5. Use undetermined coefficients $a_i$ with (4) and call on either a computer or an 
integration expert to get a closed-form expression for $\sum_{i=1}^{4} a_i g(i)$. It will agree with 
the four-transcendental expression we obtained at the beginning of this section. 
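
As a check on steps 1 to 5, the four integrals can be evaluated in closed form and compared with the series. The closed forms below were worked out for this sketch by elementary integration; they are not quoted from the note, and the function names are introduced here.

```python
import math


def g_series(i, terms=60):
    """g(i) from step 1, summed directly."""
    return sum((-1) ** k / 4 ** k / (4 * k + i) for k in range(terms))


def g_closed(i):
    """Closed forms of the integrals in step 4 (derived for this sketch)."""
    atan2, log5 = math.atan(2.0), math.log(5.0)
    return {
        1: atan2 / 2 + log5 / 4,
        2: math.pi / 2 - atan2,
        3: atan2 - log5 / 2,
        4: log5 - 2 * math.log(2.0),
    }[i]


for i in range(1, 5):
    print(i, g_series(i), g_closed(i))

# The coefficients (2, 2, 1, 0) cancel arctan 2 and log 5 and leave pi.
print(2 * g_closed(1) + 2 * g_closed(2) + g_closed(3))
```

Collecting $\sum a_i g(i)$ by π, arctan 2, log 5, and log 2 reproduces the coefficients $a_2/2$, $a_1/2 - a_2 + a_3$, $a_1/4 - a_3/2 + a_4$, and $-2a_4$ that appear in the Solve step.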


Of course, many other forms can be investigated in the hope of getting more 
formulas for π or other constants. Sadly, it seems as if these ideas may not lead to 
a formula that allows extraction of base-10 digits. From one point of view, the 
crucial miracle that makes the above formulas work is that certain arctangents that 
arise are rational multiples of π. J. Buhler has shown that this can happen only in 
situations that are essentially equivalent to the ones above; in particular, no 
base-10 formula relying on this particular phenomenon exists. Still, there might be 
other numerical miracles that could give a base-10 formula or, more generally, 
other kinds of formulas or techniques for rapid extraction of base-10 digits. 
Borsuk-Ulam Implies Brouwer: 
A Direct Construction 


Francis Edward Su 


1. INTRODUCTION. The Borsuk-Ulam theorem and the Brouwer fixed point 
theorem are well-known theorems of topology with a very similar flavor. Both are 
non-constructive existence results with somewhat surprising conclusions. Most 
topology textbooks that cover these theorems (e.g., [4], [5], [6]) do not mention the 
two are related—although, in fact, the Borsuk-Ulam theorem implies the Brouwer 
Fixed Point Theorem. 

The theorems themselves are often proved using the machinery of algebraic 
topology or the concept of degree of a map. That one theorem implies the other 
can therefore be established once one understands this machinery, but this 
requires background. Moreover, such proofs tend to be indirect, relying on the 
equivalence of these existence theorems with corresponding non-existence theo- 
rems. For instance, Dugundji and Granas [3] show that the Borsuk-Ulam theorem 
is equivalent to the statement that no antipode-preserving, continuous map f: 
$S^n \to S^n$ can be homotopic to a constant map. From this one can see that the 
Brouwer fixed point theorem is a special case, because it can be shown equivalent 
to the statement that the identity map id: $S^n \to S^n$ (which is antipode-preserving) 
is not homotopic to a constant map. 

However, such an indirect approach is not really necessary, and perhaps a more 
direct proof would give insight as to how the two theorems are related. The 
purpose of this note is to provide a completely elementary proof that the Borsuk- 
Ulam theorem implies the Brouwer theorem by a direct construction, in which the 
existence of antipodal points in one theorem yields the asserted fixed point in the 
other. 


2. THE THEOREMS. Let $S^n$ denote the unit n-sphere in $\mathbb{R}^{n+1}$, i.e., all points at 
distance one from the origin. Two points are antipodal if they lie opposite each 
other on the sphere, i.e., {x, −x} for some x. 


The Borsuk-Ulam Theorem. Let $f: S^n \to \mathbb{R}^n$ be a continuous map. There exists a 
pair of antipodal points on $S^n$ that are mapped by f to the same point in $\mathbb{R}^n$. 


This theorem was conjectured by S. Ulam and proved by K. Borsuk [1] in 1933. 
In particular, it says that if $f = (f_1, f_2, \ldots, f_n)$ is a set of n continuous real-valued 
functions on the sphere, then there must be antipodal points on which all the 
functions agree. For instance, one interpretation for the case n = 2 is that there is 
always a pair of antipodal points on the earth’s surface with the same temperature 
and barometric pressure (assuming, of course, that temperature and pressure vary 
continuously). 

Let $B^n$ denote the unit n-ball in $\mathbb{R}^n$. A fixed point for a map f from a space 
into itself is a point y such that f(y) = y. The following theorem, due to L.E.J. 
Brouwer, is one of the most celebrated theorems in topology: 


The Brouwer Fixed Point Theorem. Every continuous map $f: B^n \to B^n$ possesses a 
fixed point. 


Brouwer proved the case n = 3 in 1909, and Hadamard followed in 1910 with a 
proof for all dimensions. Brouwer gave a different proof in 1912 [2]. See [3] for 
more historical notes and a survey of fixed point theory. 

In dimension three, the Brouwer theorem is often interpreted as follows: no 
matter how you slosh around the coffee in a coffee cup (as long as you do it 
continuously), some point is always in the same position it was before the sloshing 
took place (although it might have moved around in the meantime). Moreover, 
should you try to move this point out of its original position, you will unavoidably 
move some other point back into its original position. 


3. THE IDEA. As motivation we first briefly sketch a construction that shows how 
the Borsuk-Ulam theorem implies the Brouwer fixed point theorem. 

We choose to think of $B^n$ as $[-1, 1]^n$, the “n-cube” in $\mathbb{R}^n$. Similarly, we choose 
to think of $S^n$ as the boundary of the (n + 1)-cube $[-1, 1]^{n+1}$ in $\mathbb{R}^{n+1}$: 


$$S^n = \{x \mid x = (x_1, x_2, \ldots, x_{n+1}),\ |x_i| \le 1 \text{ and } \max_i |x_i| = 1\}.$$


The “cubical” n-sphere is homeomorphic to the usual n-sphere via the rays from 
the origin. In fact, this is an antipode-preserving homeomorphism, so the Borsuk- 
Ulam theorem holds for maps on cubical n-spheres. We choose to work with 
cubical n-spheres and n-balls because constructing and describing functions on 
such objects is easier in rectangular coordinates. 

Given $f: B^n \to B^n$, we would like to construct a map $g: S^n \to \mathbb{R}^n$ that encodes 
f in such a manner that the existence of Borsuk-Ulam antipodal points for g 
implies the existence of a Brouwer fixed point for f. 

The idea is as follows: on the cubical n-sphere, the “top” and “bottom” faces of 
the cube are homeomorphic copies of B”. These are separated by an “equatorial” 
band, consisting of the other faces. Our task is to define a continuous function g 
on these three regions of $S^n$. On the top face, we define g in such a way that a 
zero of g implies a fixed point for f. We then define g on the bottom face so that 
the image of each point there is the negative of the image of its antipode on the 
top face. Such a map is called antipode-preserving, meaning g(−x) = −g(x). If we 
can patch in the equatorial region with a map that is also antipode-preserving but 
never zero, then the Borsuk-Ulam Theorem guarantees the existence of antipodal 
points that get mapped by g to the same point. Because g is antipode-preserving, 
these antipodal points must get mapped to zero, which by construction cannot 
occur in the equatorial band. A zero for g on the top or bottom face then implies 
a fixed point for f. 


4. THE CONSTRUCTION. We seek to construct a map $g = (g_1, g_2, \ldots, g_n)$: 
$S^n \to \mathbb{R}^n$ that is continuous and antipode-preserving, i.e., g(−x) = −g(x). 

We first construct g on the “top” and “bottom” faces. Note that each face, on 
which some coordinate $x_i = \pm 1$, is an n-cube. When $x_{n+1} = \pm 1$ we obtain: 


$$S^n_{\mathrm{top}} = \{x \mid x = (x_1, x_2, \ldots, x_n, 1)\} \quad\text{and}\quad S^n_{\mathrm{bot}} = \{x \mid x = (x_1, x_2, \ldots, x_n, -1)\},$$


which denote the top and bottom faces of the cubical $S^n$. See Figure 1. Let 
$p: \mathbb{R}^{n+1} \to \mathbb{R}^n$ be defined by $p(x) = (x_1, \ldots, x_n)$, i.e., p ignores the last 
coordinate. 




Figure 1. Top, bottom, and equator of the cubical n-sphere (here n = 2). An example of antipodal 
points is indicated. 


For all x in $S^n_{\mathrm{top}}$, define $g(x) = p(x) - f(p(x))$. For all x in $S^n_{\mathrm{bot}}$, define $g(x) = 
p(x) + f(-p(x))$. 

Since p(−x) = −p(x), one may check that g(−x) = −g(x). Thus g is, so far, 
antipode-preserving. It is continuous, since f and p are. If g(x) = 0 at a point x of the 
top face then p(x) is a fixed point for f; at a point of the bottom face, −p(x) is. 

Now we want to define g on the “side” faces of the cubical n-sphere so that it 
matches up continuously with g on $S^n_{\mathrm{top}}$ and $S^n_{\mathrm{bot}}$ and is still antipode-preserving, 
but is never zero on the sides. The latter is the tricky part. 

One might try to extend the values of g linearly from top to bottom, but this 
does not guarantee that g ≠ 0 on the sides. However, the following lemmas show 
that if we define g suitably on the equator, we can linearly extend the values from 
the equator to top and bottom without creating a new zero for g. 


Lemma 1. Let F be a “side” face of the cubical $S^n$. That is, there exists some k, 
$1 \le k \le n$, such that for all x in F, $x_k$ is constant and equal to +1 or −1. Then for 
all x in $F \cap (S^n_{\mathrm{top}} \cup S^n_{\mathrm{bot}})$, the coordinate function $g_k(x)$ is either 0 or has the same 
sign as $x_k$. 


Proof: By the definition of g, on $F \cap S^n_{\mathrm{top}}$, $g_k(x) = x_k - f_k(p(x))$ and on $F \cap S^n_{\mathrm{bot}}$, 
$g_k(x) = x_k + f_k(-p(x))$. Since f is a map to an n-cube, $|f_k| \le 1$. Hence, if $x_k = 1$, 
then $g_k(x) \ge 0$ for all $x \in F \cap (S^n_{\mathrm{top}} \cup S^n_{\mathrm{bot}})$. If $x_k = -1$, then $g_k(x) \le 0$ for all 
$x \in F \cap (S^n_{\mathrm{top}} \cup S^n_{\mathrm{bot}})$. ∎ 

Now let $S^n_{\mathrm{eq}}$ denote the “equator” of $S^n$, i.e., $\{x \in S^n \mid x = (x_1, \ldots, x_n, 0)\}$. For 
all $x \in S^n_{\mathrm{eq}}$, define g on $S^n_{\mathrm{eq}}$ by 


$$g(x) = p(x) + \frac{g(x_1, \ldots, x_n, -1) + g(x_1, \ldots, x_n, 1)}{2} \tag{1}$$


and observe that g is now antipode-preserving on the equator. 


Lemma 2. For all $x \in S^n_{\mathrm{eq}}$, if $|x_k| = 1$, then the coordinate function $g_k(x)$ is not 0 
and has the same sign as $x_k$. 


Proof: Lemma 1 shows that if $|x_k| = 1$, $g_k$ has the same sign as $x_k$ on the top and 
bottom faces if it is not zero. Therefore, using (1) and $p_k(x) = x_k = \pm 1$, we see 
that $g_k(x)$ is non-zero on the equator and has the same sign as $x_k$. ∎ 


To define g on the equator we have “averaged” the values of the corresponding 
points on $S^n_{\mathrm{top}}$ and $S^n_{\mathrm{bot}}$ and then “lifted” that average by p(x) (which equals $x_k$ in 
the k-th coordinate) to pull it away from possibly being zero. See Figure 2. 


Figure 2. Extending the values of $g_k$ linearly from the equator to top and bottom. Graphs show a 
cross-section of $g_k$ values along one “longitude” of the cubical n-sphere on the faces determined by 
$x_k = +1, -1$ (one panel for $x_k$ positive, one for $x_k$ negative). 


We now define g continuously on the rest of $S^n$ by extending it linearly from 
the equator to the values on $S^n_{\mathrm{top}}$ and $S^n_{\mathrm{bot}}$. That is, for $0 \le x_{n+1} \le 1$, let 

$$g(x) = x_{n+1}\, g(x_1, \ldots, x_n, 1) + (1 - x_{n+1})\, g(x_1, \ldots, x_n, 0). \tag{2}$$

For $-1 \le x_{n+1} \le 0$, let 

$$g(x) = -x_{n+1}\, g(x_1, \ldots, x_n, -1) + (1 + x_{n+1})\, g(x_1, \ldots, x_n, 0). \tag{3}$$


Refer to Figure 2 again. Note that g is continuous and antipode-preserving. 
Furthermore, it can achieve 0 only on $S^n_{\mathrm{top}}$ or $S^n_{\mathrm{bot}}$ because of 


Lemma 3. If $|x_{n+1}| < 1$, then $g(x) \ne 0$. 


Proof: Since $|x_{n+1}| < 1$, we are on a side face and therefore there exists some k, 
$1 \le k \le n$, for which $x_k = \pm 1$. We shall show that g(x) cannot be zero by 
showing that the coordinate function $g_k(x)$ is non-zero. 

Consider (2) and (3). By Lemmas 1 and 2, $g_k(x_1, x_2, \ldots, x_n, \pm 1)$ is either zero or has 
the same sign as $x_k$, while $g_k(x_1, x_2, \ldots, x_n, 0)$ is non-zero and has the same sign as $x_k$. 
Moreover, $(1 - x_{n+1})$ and $(1 + x_{n+1})$ are strictly positive because $|x_{n+1}| < 1$. Equations 
(2) and (3) now imply that $g_k(x)$ is non-zero and, in fact, has the same sign as $x_k$. ∎ 


Now that g is defined everywhere on $S^n$, the Borsuk-Ulam Theorem implies 
that there exists a pair {x, −x} such that g(x) = g(−x). But g(x) = −g(−x), since 
g is antipode-preserving. Therefore g(x) = g(−x) = 0 which, by Lemma 3, implies 
that one of the pair {x, −x} is in $S^n_{\mathrm{top}}$. Without loss of generality, suppose it is x. 
Then $g(x) = p(x) - f(p(x)) = 0$ on $S^n_{\mathrm{top}}$ implies that for $y = p(x) \in B^n$, we have 
f(y) = y, which proves the Brouwer Fixed Point Theorem. 
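
The construction can be tried out in the lowest dimension, n = 1, where the cubical sphere is the boundary of the square $[-1, 1]^2$. The sketch below is illustrative only: the names `make_g` and `fixed_point_from_g` are introduced here, the choice f = cos is just an example of a continuous map of [−1, 1] into itself, and bisection stands in for the non-constructive Borsuk-Ulam step (which for n = 1 amounts to the intermediate value theorem).

```python
import math


def make_g(f):
    """Build g on the boundary of [-1, 1]^2 from f: [-1, 1] -> [-1, 1]."""

    def top(x1):       # face x2 = +1
        return x1 - f(x1)

    def bottom(x1):    # face x2 = -1
        return x1 + f(-x1)

    def equator(x1):   # formula (1): average of top and bottom, lifted by p(x)
        return x1 + 0.5 * (bottom(x1) + top(x1))

    def g(x1, x2):
        if max(abs(x1), abs(x2)) != 1.0:
            raise ValueError("point is not on the cubical 1-sphere")
        if x2 >= 0:    # formula (2); equals top(x1) when x2 == 1
            return x2 * top(x1) + (1 - x2) * equator(x1)
        # formula (3); equals bottom(x1) when x2 == -1
        return -x2 * bottom(x1) + (1 + x2) * equator(x1)

    return g


def fixed_point_from_g(g, steps=60):
    """Locate a zero of g on the top face by bisection and return y = p(x)."""
    lo, hi = -1.0, 1.0   # g(-1, 1) <= 0 <= g(1, 1) because |f| <= 1
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(mid, 1.0) <= 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


g = make_g(math.cos)
y = fixed_point_from_g(g)
print(y, math.cos(y))
```

On the two side faces $x_1 = \pm 1$ the values of this g keep the sign of $x_1$, as Lemma 3 predicts, so the only zeros lie on the top and bottom faces.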


ACKNOWLEDGMENTS. I thank Michael Starbird for many lively discussions and Courtney Coleman 
for helpful stylistic suggestions. 
Does Mathematics Distinguish Certain 
Dimensions of Spaces? 


Zdzisław Pogoda and Leszek M. Sokołowski 


translated by Abe Shenitzer with the editorial assistance of Hardy Grant 


PART I 


1. DIMENSION IN PHYSICS, OR: WHAT WORLD DO WE LIVE IN? The 
Greeks of, say, Euclid’s time talked of diastasis—extension—of bodies rather than 
of the number of their dimensions. 

It was Claudius Ptolemy who advanced the idea that the extension of a body 
should be measured by the number of mutually perpendicular straight lines (rigid 
rods) issuing from a point in the body and claimed that this number cannot exceed 
three. Ptolemy’s idea is a precursor of the idea of a system of coordinate axes used 
to this day in mathematics (outside topology) and in physics. His claim, which we call 
Ptolemy’s law, is deeply rooted in Western culture and remained unchallenged 
until the third decade of the 20th century. The remarkableness of this fact 
becomes clear if we recall that the totality of the physics of the ancients, with the 
exception of the Pythagorean thesis of the sphericity of the Earth and of the 
celestial bodies, of Archimedes’ law, and of the elements of the theory of simple 
machines, was questioned and rejected at the beginning of the modern era. 
Ptolemy’s law was first challenged in 1919 when Theodor Kaluza, then an 
instructor at Königsberg, sent a letter to Einstein in which he advanced the hypothesis of 
a 5-dimensional universe. Kaluza did not base his hypothesis on any experimental 
facts. His thinking was guided by the notion that in a 5-dimensional universe he 
could unify gravitation and electromagnetism. 

In a sense, nothing has changed since 1919. Ptolemy’s law is supported by 
countless experiments in all areas of physics. Moreover, there is not a single 
empirical fact that would suggest even indirectly the existence of additional 
dimensions of space. At the present time, the only reason for questioning Ptolemy’s 
law is the hope that in a multidimensional universe all elementary interactions 
could be combined into one. Viewed by an observer in “our” 4-dimensional 
spacetime, this single interaction breaks down into gravitation, nuclear forces, and 
electroweak interactions. The only thing that has evolved between 1919 and today 
is the concrete version of the unification idea. First it was the classical 5-dimen- 
sional theory of Kaluza-Klein, then the multidimensional theory of Kaluza-Klein 
with arbitrary gauge fields, based largely on the conceptions of De Witt and
Witten, and, as of 1984, the hopes of the physicists have focussed on the theory of
superstrings. 

These theories have in common the view that physical spacetime has d = 4 + n
dimensions of which four are directly observable and the remaining n are not—at 
least not with present experimental techniques. The two key questions to be 
answered here are: what is the value of n, i.e., the value of the dimension d? and
why does the d-dimensional universe contain a distinguished subspace of dimen- 
sion four? The second question is actually made up of the following two: why does 
the universe not have a structure analogous to that of Minkowskian spacetime, all 
of whose dimensions are equivalent in a well-known sense of the term, but 
decomposes into an “easy-to-see” component and a “difficult-to-see” component, 
and what is the significance of the dimensions of these components?

We have no satisfactory answer to any of these questions. The classical Kaluza- 
Klein theory, in which d = 5, was given up long ago. There is also a modern 
version of this theory based on supergravity. In 1981 Edward Witten, working with 
this version, offered quite convincing arguments in favor of the value d = 11. But 
this theory had to be abandoned three years later because of insurmountable 
difficulties. In its place came the theory of superstrings, which engendered great 
hopes and strong emotions. Some important scientists hailed it as the General 
Theory of Everything. By 1987 this enthusiasm decreased somewhat. Yet the 
theory of superstrings remains the most promising candidate for a theory that bids 
fair to quantize gravitation and unify all interactions. This would lead to a 
coherent theory of all elementary particles. Still, the theory of superstrings is only 
a candidate for such a theory. Also, it is, so to say, very much nonunique, in that it 
admits a great variety of models. The trouble is that we do not know enough to 
choose the “true” one of these models. Furthermore, the theory is nonunique 
when it comes to the dimension of spacetime. At first it seemed that if one 
considers bosonic strings then the theory can be internally consistent only for 
d = 26, while the theory of fermionic and supersymmetric strings, of greater 
interest for physicists, requires the value d = 10. Later, however, it was shown that 
what had previously been regarded as consistency was actually just simplicity: both 
dimensions are singled out by the theory for “pure” strings. By associating a 
number of physical fields with a string one can achieve a consistent string theory 
for any value of d. Thus the theory again gives no hint regarding the dimension of 
the physical world. 

Suppose we accept d = 10 as a tentative answer to the first of our two
questions. What about an answer to the second one? All we can say at this point is 
that we know neither the mechanism that brought into being the “easy-to-see” and 
the “difficult-to-see” components of the universe nor the factors that determine 
their respective dimensions. 

In spite of the complete absence of relevant experimental data, the idea of a 
multidimensional universe has become a fixed part of theoretical physics and can 
be expected to influence the direction of its development for a long time to come. 
This being so, it is reasonable to accept tentatively its validity. Even if the idea 
turns out to be false, the question it gave rise to, namely the question of why 
Ptolemy’s law holds, i.e., why d = 4, retains its relevance. In other words, a 
satisfactory future physical theory must not treat the dimension of physical 
spacetime as a free parameter whose value is determined experimentally but must 
be able to explain why its value is what it is.


2. DIMENSION IN MATHEMATICS: IS IT WEIGHTY? We have seen that, so
far, physics provides no persuasive answer to the question of the dimension of 
physical space. Does mathematics? Does mathematics distinguish certain dimen- 
sions of space and are they close to 3, the only empirically distinguished dimen- 
sion? More generally, is dimension a significant mathematical concept? In other 
words, will a mathematician agree that the question of the number of dimensions 
of spacetime is important or will he tell us that the number of dimensions of, say, 
Euclidean space conveys information that is as trivial as that conveyed by the fact 
that we describe the surface of the earth with respect to a rectangular coordinate 
system oriented in the east-west and north-south directions? If this were the 
viewpoint of mathematics, then we would have to admit that the empirically 
determined dimension of spacetime is something “local” and devoid of deeper 
physical significance, something akin to the empirically determined difference 
between the horizontal and vertical directions, something due to the local gravita- 
tional field of the earth and without a universal character. This would also imply 
that every physical theory in which dimension plays an important role is false in a 
definite sense. To illustrate what we are trying to say note that Aristotle’s physics, 
with its division of the universe into sublunar and translunar regions, has no 
adequate mathematical apparatus; Euclidean geometry, known in antiquity, does 
not fit it in the least. We hope that these remarks make it clear that the 
determination of the status of the concept of dimension in mathematics is an issue 
of profound importance for physics. 

While the mathematical notion of dimension first came up in the context of 
Euclidean geometry, it is actually a topological concept applicable to quite a large 
class of topological spaces. Some basic problems of dimension theory were solved 
only in the first half of the 20th century and many others remain open. Some 
mathematicians think that problems associated with the notion of dimension are of 
the utmost importance not only in topology but also in all of mathematics. There is 
the famous dictum of Poincaré: “I think that the most important of the theorems 
of Analysis situs [an earlier name for topology that goes back to Leibniz and was 
very popular in the 19th century] is the one that asserts that space has three 
dimensions [1].”

We noted earlier that the roots of dimension theory go back to the Hellenistic 
period. In modern mathematics the concept of dimension was initially treated 
intuitively and nonuniformly: there were distinct definitions of dimension
in geometry and in linear algebra, as well as distinct definitions of the dimension of 
a simplex and of the dimension of a linear subspace. Nevertheless dimension was, 
and continues to be, one of the most important elements of the description of 
space and of geometric objects. The need for a nonintuitive, rigorous definition of 
dimension was highlighted by remarkable discoveries made at the turn of the 20th 
century. Thus in 1877 Georg Cantor showed that a segment and a square have the 
same cardinalities, i.e., that there exists a one-to-one mapping of a segment onto a 
square. This implied that dimension is not a set-theoretic concept. Again, a curve 
can be thought of as the result of a single stroke of a pencil. Hence Camille 
Jordan’s definition of a curve as a subset of the plane that is a continuous image of 
an interval. Intuition dictated the notion that such an object is 1-dimensional. But 
in 1890 Giuseppe Peano constructed a curve, i.e., a continuous image of an
interval, that filled a square. These results showed that the dimension of a 
Euclidean space is not preserved either by one-to-one or by continuous mappings. 
There remained the open problem of whether a topological mapping, i.e., a
one-to-one bicontinuous mapping, can map a plane onto 3-space. We should add
that at that time the question of dimension in topology was primarily a question of 
its relation to Euclidean geometry. The properties of plane figures are so different 
from the properties of solids that if the dimension of a Euclidean space were not a 
topological invariant then topology would tell us very little about its geometric 
properties. The problem of dimension was made more acute by the discovery of 
curious figures such as the Sierpiński carpet and the Menger cube (see [2], [4]),
whose dimensions cannot be guessed at a glance. It was obvious that the notion of 
dimension had to be reinvestigated “from the ground up.” 

It is probably safe to say that a breakthrough was Luitzen E.J. Brouwer’s proof, 
provided between 1911 and 1912, of the invariance of domain (see [2]). This 
implied that if n ≠ m then $\mathbb{R}^n$ and $\mathbb{R}^m$ are not homeomorphic. The significance of
this result is obvious: since the intuitive notion of the dimension of a Euclidean 
space, given by the number of coordinates, turned out to be a topological invariant, 
it was reasonable to talk of dimension in topology. But the nature of the new 
invariant remained unclear because Brouwer did not use in his proof any topologi- 
cal characteristics of Euclidean dimension. 

During the next two decades, i.e., until the mid 1930s, mathematicians such as 
Henri Lebesgue, Karl Menger, Pavel Urysohn, and Eduard Čech developed the
foundations of a topological theory of dimension by defining dimension in a 
rigorous way and by proving many theorems that determined its properties; the 
theory applies to all classical cases (see [5]). Unfortunately, the theory is not 
unique in that there are three different definitions of the dimension of a topologi- 
cal space, namely the covering dimension (dim), the small inductive dimension 
(ind), and the large inductive dimension (Ind). These three notions coincide for 
metric spaces with a countable basis but diverge for more general spaces. 

Fortunately, the complications that arise in the general theory do not worry the 
physicists because the most general spaces used in physics are differentiable 
manifolds. In view of Whitney’s immersion theorem such a manifold can always be 
represented as a subset of $\mathbb{R}^k$ for a large enough k. More precisely, if the manifold
has dimension n, then it is a (regularly immersed) subset of $\mathbb{R}^{2n+1}$ (see [2]). In the
sequel we will limit ourselves to $\mathbb{R}^n$ and its subsets. This means that we can use
any one of the three definitions of dimension. These are rather complicated; the 
simplest and the most convenient for us is the definition of small inductive 
dimension. We present an abbreviated version of it. 


1. The dimension of the empty set is −1 (ind ∅ = −1).

2. If every point of a space X has an arbitrarily small neighborhood whose
boundary has dimension ≤ n − 1, then X has dimension ≤ n (ind X ≤ n).

3. If ind X ≤ n but it is not true that ind X ≤ n − 1, then X has dimension n
(ind X = n).

4. If ind X ≤ n is false for all n, then the dimension of X is infinite
(ind X = ∞).


If we look carefully at the above definition of dimension, then we see that it 
includes all the intuitive perceptions that we associate with this notion. If we 
remove from a line a point together with a small neighborhood of this point, then 
the boundary of that neighborhood consists of two points (if finitely many lines 
intersect at the point in question, then the corresponding boundary consists of 
finitely many points) and so has dimension 0. Similarly, if we remove from a plane 
(or, more generally, from a surface) a point together with a small neighborhood of
this point, then the boundary of that neighborhood consists of a closed line and so
has dimension 1. This observation can be carried over by induction to any number 
of dimensions. 
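
The first steps of this induction can be written out explicitly. The chain below is only a sketch: it records the upper bounds that follow directly from the definition, taking small intervals, arcs, and disks as the neighborhoods (our choice, not spelled out in the text), and it does not attempt the harder proof that the bounds are attained.

```latex
\begin{align*}
\operatorname{ind}\,\emptyset &= -1, \\
\operatorname{ind}\,\{a, b\} &= 0
  && \text{(each point is an open neighborhood of itself with empty boundary)}, \\
\operatorname{ind}\,\mathbb{R} &\le 1
  && \text{(the boundary of } (x - \varepsilon,\, x + \varepsilon) \text{ is } \{x - \varepsilon,\, x + \varepsilon\}\text{)}, \\
\operatorname{ind}\,S^1 &\le 1
  && \text{(the boundary of a small arc is a pair of points)}, \\
\operatorname{ind}\,\mathbb{R}^2 &\le 2
  && \text{(the boundary of a small disk is a circle).}
\end{align*}
```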

The definition just given satisfies almost all of the requirements we are likely to 
associate with the notion of dimension. Its advantages are twofold. For one thing, 
the inductive definition of dimension of a set agrees with its “intuitive dimension.” 
For example, the inductive dimension of “n-dimensional” figures such as a cube, a 
sphere, a ball, and a simplex is n (this is difficult to prove!). For another, the 
definition of the dimension ind enables us to assign dimension numbers to an 
extensive class of topological spaces that baffle our intuition. Examples of such 
spaces are the Cantor set (dimension 0), the Sierpiński carpet (ind = 1), and the
Menger cube (ind = 1) (see [3]).

We used the words “almost all” in connection with the definition of ind
because, upon closer inspection, it turns out to have certain flaws (we won’t discuss 
them here) which led to the formulation of the two other definitions of dimension 
(chronologically, dim appeared in Lebesgue’s papers before the appearance of
ind). These two definitions took the place of ind in more general topological spaces.

To avoid possible misunderstandings, we wish to emphasize that, in spite of its 
name, the fractal Hausdorff dimension, now very fashionable in statistical physics, 
has nothing in common with the notion of topological dimension discussed in the 
present paper, for it is not a topological invariant. 

In the sequel, whenever discussing Euclidean spaces and their subsets, we will 
write dim X in place of ind X. 

It is clear that the definition of dimension does not single out any dimension 
(with the possible exception of the dimension of the empty set and the infinite 
dimension). This being so, it is natural to expect that the number of constructs will 
grow with the dimension of the space. A space of large dimension can be expected 
to have many relations and connections among its subsets, and the latter are likely 
to be marked by great complexity. In brief, we can expect that the greater the 
dimension, the greater the constructional possibilities. But one must also expect 
increased difficulties to be faced by the mathematician deprived of the assistance 
of intuition. On the whole, our expectations—and fears—are correct. This is 
confirmed by a host of theorems in geometry, in linear algebra, in the theory of 
manifolds, in the theory of differential equations on manifolds, and so on. 

But is it always the case that increased dimension implies increased complexity 
and an increased supply of structures? The most obvious counterexample is that of 
vector spaces. And there are other, rather weighty counterexamples to this asser- 
tion as well as problems in which dimension plays a very subtle role. The rest of 
this paper is an account of these counterexamples. Their detailed analysis will 
enable us to answer in a mathematically reasonable manner the question in the 
title of this paper: Does mathematics distinguish certain dimensions of spaces? 


3. CLASSIFICATION OF REGULAR POLYHEDRA. We begin with a problem 
which a mathematician is likely to dismiss as a mere curiosity, but which serves as 
the most elementary counterexample to the assertion that a space of large 
dimension is richer in structural possibilities than a space of small dimension. 
Specifically, we consider the classification of regular polyhedra and their higher- 
(and lower-) dimensional analogues in Euclidean spaces. 

A complete definition of a polyhedron is too complicated to be given here. For 
our purposes it suffices to define a polyhedron as the intersection of a finite
number of halfspaces. A regular polyhedron is a polyhedron whose faces are
congruent regular polygons and which has the additional property that the number
of faces at each vertex is the same. Informally, we could say that a regular polyhedron is a
polyhedron which looks the same from every direction. The ancients knew that 
there are exactly five regular polyhedra (Platonic solids), namely, the tetrahedron, 
the hexahedron (the cube), the octahedron, the dodecahedron, and the icosahe- 
dron (see [6], [7]). 
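
The count of five can be checked by a short enumeration. The sketch below relies on a standard criterion that is not stated in the text: if the faces are regular p-gons and q of them meet at each vertex, the face angles at a vertex must add up to less than a full turn, which reduces to (p − 2)(q − 2) < 4. The function names and the search limit are ours.

```python
# Enumerate pairs (p, q): p-sided regular faces, q faces at each vertex.
# Assumption (not from the text): a convex regular polyhedron exists exactly
# when (p - 2) * (q - 2) < 4 with p, q >= 3.

NAMES = {
    (3, 3): "tetrahedron",
    (4, 3): "hexahedron (cube)",
    (3, 4): "octahedron",
    (5, 3): "dodecahedron",
    (3, 5): "icosahedron",
}

def platonic_symbols(limit=20):
    """Return all (p, q) with 3 <= p, q < limit satisfying the angle condition."""
    return [(p, q)
            for p in range(3, limit)
            for q in range(3, limit)
            if (p - 2) * (q - 2) < 4]

def counts(p, q):
    """Vertices, edges, and faces of the solid, from Euler's formula V - E + F = 2."""
    edges = 2 * p * q // (2 * p + 2 * q - p * q)
    return 2 * edges // q, edges, 2 * edges // p

for symbol in platonic_symbols():
    print(NAMES[symbol], symbol, counts(*symbol))
```

Raising the search limit does not produce further solutions, since (p − 2)(q − 2) only grows with p and q.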


Figure 1. Lower- and higher-dimensional analogues of a cube (on the left) and a tetrahedron (on the 
right). From the figure we can read off the procedure for forming higher-dimensional cubes from 
lower-dimensional ones: a square (a 2-dimensional cube) is obtained from two appropriately located 
segments (1-dimensional cubes) by appropriately joining their vertices (their endpoints), a cube is 
obtained from two appropriately located squares by appropriately joining vertices, a hypercube is 
obtained from two cubes, etc. The analogues of a tetrahedron—simplexes—are the smallest convex sets 
containing respectively three noncollinear points (a triangle), four noncoplanar points (a tetrahedron), 
five points not in the same space (a 4-dimensional simplex), etc. 


The role of regular polyhedra in the plane is played by regular polygons. The 
Greek mathematicians knew that their number is infinite. It is easy to define 
analogs of regular polyhedra in 4-dimensional space—the role of faces is played by 
ordinary 3-dimensional regular polyhedra. It turns out that there are just six such 
“regular cells” [6], [7]. What is surprising is that a Euclidean space of dimension 5 
or higher contains just three regular cells! One of them is the n-dimensional 
analogue of a cube and is called a hypercube. 

In order to describe the two remaining regular cells we introduce the notion of
a p-dimensional face of an n-dimensional (not necessarily regular) polyhedron,
p ≤ n − 1, which is a “face of a face.” For example, for n = 3 we have: p = 0:
vertices, p = 1: edges, p = 2: ordinary faces of a polyhedron. It is possible to show
that the number of p-dimensional faces of a hypercube is $2^{n-p}\binom{n}{p}$. We also
introduce the notion of a simplex, which will be needed later as well. We take a
Euclidean space of sufficiently large dimension and in it n + 1 points that do not
lie on an (n − 1)-dimensional hyperplane (if n = 3 the four points are not
coplanar). We now join the points of each pair by means of straight lines and
obtain in this way an n-dimensional polyhedron whose vertices are the initial
n + 1 points; the segments joining the vertices determine the p-dimensional faces
of the polyhedron, 1 ≤ p ≤ n − 1. This polyhedron is called an n-dimensional
simplex. For n = 0 the simplex is a point, for n = 1 a segment, for n = 2 a
triangle, for n = 3 a tetrahedron, etc. Now the second regular cell for dimension
n ≥ 5 is the n-dimensional regular simplex; it has $\binom{n+1}{p+1}$ p-dimensional faces. The
third and last regular cell is the dual of the hypercube; its duality is reflected in the
fact that the number of its p-dimensional faces is equal to the number of
(n − p − 1)-dimensional faces of the hypercube.
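
The three face counts are easy to tabulate. The following sketch uses the standard formulas for the hypercube and the simplex, and obtains the counts for the dual of the hypercube by pairing p with n − 1 − p (vertices of one correspond to top-dimensional faces of the other); the function names are ours. As a sanity check it also evaluates the alternating sum of the face numbers, which for any convex n-dimensional polyhedron equals 1 − (−1)^n.

```python
from math import comb

def hypercube_faces(n, p):
    """Number of p-dimensional faces of the n-dimensional hypercube."""
    return 2 ** (n - p) * comb(n, p)

def simplex_faces(n, p):
    """Number of p-dimensional faces of the n-dimensional simplex."""
    return comb(n + 1, p + 1)

def dual_hypercube_faces(n, p):
    """p-faces of the dual of the hypercube, read off from the hypercube
    by exchanging p with n - 1 - p (the duality convention assumed here)."""
    return hypercube_faces(n, n - 1 - p)

def alternating_sum(face_count, n):
    """Sum of (-1)^p * f_p over p = 0, ..., n - 1."""
    return sum((-1) ** p * face_count(n, p) for p in range(n))

for n in (3, 4, 5):
    for name, f in (("hypercube", hypercube_faces),
                    ("simplex", simplex_faces),
                    ("dual of hypercube", dual_hypercube_faces)):
        print(n, name, [f(n, p) for p in range(n)], alternating_sum(f, n))
```

For n = 3 the three rows are the familiar vertex, edge, and face counts of the cube, the tetrahedron, and the octahedron.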

It would seem that the number of regular cells should increase with increasing 
dimension. But this is not the case. The number in question has the fixed value 3 
for all n ≥ 5. In this sense the dimensions 2, 3, and 4, and especially 2, are
distinguished. 


4. CLASSIFICATION OF MANIFOLDS. Our primary reason for considering the 
classification of manifolds is not to supply another counterexample to the assertion 
stated earlier. Rather, it is to illustrate the singular situations that turn up in 
spaces of different dimensions. The simplicity and considerable generality of the 
notion of a manifold has made it a leading concept of modern mathematics. One 
of the key advantages of manifolds is that it is possible to give their global 
description using methods characteristic of Euclidean spaces. Specifically, the 
technique created for flat spaces can be used for “curved” objects. 

We will be mostly concerned with the notion of a topological manifold. An 
n-dimensional manifold is a Hausdorff space every point of which has a neighborhood
U homeomorphic to all of $\mathbb{R}^n$ or to one of its open subsets. The mapping $\varphi$
that takes the neighborhood U to an open subset of $\mathbb{R}^n$ is called a coordinate
system, the pair $(U, \varphi)$ is called a chart, and the value $\varphi(p) = (x^1, \ldots, x^n)$ of $\varphi$ at
a point $p \in U$ is referred to as the coordinates of p with respect to this chart. Let
$(U, \varphi)$ and $(U, \psi)$ be two charts with the same domain but with different coordinate
systems. Then the composition $\varphi \circ \psi^{-1}$ maps a certain open set in $\mathbb{R}^n$ onto
another open set in $\mathbb{R}^n$. The definition of a topological manifold requires this
composition to be a homeomorphism in $\mathbb{R}^n$. If, in addition, we require $\varphi \circ \psi^{-1}$ and
its inverse to be infinitely differentiable, then we obtain the notion of a smooth 
manifold [2]. 

Physicists work most of the time with smooth manifolds, familiar from courses 
in analysis or in tensor calculus; a prototype of such a manifold is any smooth 
non-self-intersecting surface in Euclidean space. Here the term “smooth surface” 
is somewhat misleading, for “smoothness,” i.e., the existence of a tangent plane, is 
not necessary for a surface to be a smooth manifold (a similar remark applies to 
higher-dimensional manifolds). A close look at the definition shows that a surface 
as rich in edges as the surface of a cube is also a smooth manifold! Indeed, this 
surface is homeomorphic to a sphere, and every topological space homeomorphic 
to a smooth manifold can be given the structure of a smooth manifold. However, it
is convenient to stick to certain accepted images, and, when talking about differentiable
manifolds, to have in mind objects such as a hyperboloid, a sphere, a torus,
etc. 

Let us go back to topological manifolds. A natural and very important problem 
is that of their classification. This means giving a sequence of nonhomeomorphic 
manifolds of the same dimension such that every manifold of this dimension is 
homeomorphic to one of the manifolds in the sequence. In the case of dimension 1 
things are simple: every manifold is homeomorphic to a circle or to a straight line, 
so that the required sequence has just two elements. 2-dimensional manifolds, or 
surfaces, were classified at the beginning of the 20th century (in the case of 
dimensions 2 and higher one usually classifies only compact manifolds without a 
boundary, and these are the only manifolds we consider in this paper); here the 
sequence of representative manifolds is infinite. (More precisely, the sequence 
splits into two branches. The elements of one branch are connected sums of 
finitely many toruses, i.e., product spaces $S^1 \times S^1$, and the elements of the other
branch are connected sums of finitely many projective planes.) As for 3-dimen- 
sional manifolds, the situation is as yet unclear; we do not know if these are 
classifiable. In 1982 William Thurston gave a partial—and very incomplete—clas- 
sification of these manifolds (see [36] and the popular article [8]). We do not know 
what further progress is possible here. 
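
For surfaces, the two branches of the classifying sequence can be told apart by very simple data. The sketch below uses two standard facts that the text does not state: a connected sum of k toruses has Euler characteristic 2 − 2k, a connected sum of k projective planes has Euler characteristic 2 − k, and a closed connected surface is determined by its orientability together with this number. Treating the sphere as the connected sum of zero toruses is our convention, as are the function names.

```python
def euler_characteristic(orientable, k):
    """Euler characteristic of a connected sum of k toruses (orientable=True)
    or of k projective planes (orientable=False)."""
    if orientable:
        return 2 - 2 * k      # k = 0 is taken to be the sphere
    if k < 1:
        raise ValueError("a nonorientable surface needs at least one projective plane")
    return 2 - k

def identify(orientable, chi):
    """Recover the representative surface from orientability and chi."""
    if orientable:
        if chi > 2 or chi % 2:
            raise ValueError("no closed orientable surface has this Euler characteristic")
        k = (2 - chi) // 2
        return "sphere" if k == 0 else f"{k}-fold connected sum of toruses"
    if chi > 1:
        raise ValueError("no closed nonorientable surface has this Euler characteristic")
    return f"{2 - chi}-fold connected sum of projective planes"

for k in range(4):
    print(identify(True, euler_characteristic(True, k)))
for k in range(1, 4):
    print(identify(False, euler_characteristic(False, k)))
```

No comparably simple pair of invariants is known to do the same job in dimension 3, which is one way of seeing why the situation there is so much less clear.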




Figure 2. The sets in this figure are not manifolds. This is due to the presence of branch points or 
branch segments or to local changes of dimension. 


Matters are clear for topological manifolds of dimension 4 and higher: in 1958 
A.A. Markov proved that their classification is impossible [9], [10]. We will try to 
explain why this is so. Every smooth compact manifold can be triangulated, i.e., cut 
into simplexes (recall that simplexes of dimension 1, 2, and 3 are segments, 
triangles, and tetrahedra respectively). This implies the possibility of associating 
with a manifold a table listing its simplexes, their faces, and the ways in which they 
are connected; this table is a kind of code for the manifold. It is clear that such a 
code is not a unique “proof of identity” of the manifold it represents, for it is
possible to cut up a given manifold in different ways and thus to associate with it
different codes. This means that we would need a universal algorithm for deciding 
whether or not two given codes represent the same manifold (of course, the same 
up to a homeomorphism). If such an algorithm existed, then we could use it to 
divide all the codes into equivalence classes, with codes in each class correspond- 
ing to the same manifold. Markov’s theorem is a nonexistence theorem: for 
4-dimensional (and for higher-dimensional) manifolds there is no algorithm that 
would permit us to decide if two given codes correspond to homeomorphic 
manifolds. Thus the question of classification of manifolds is doubly hopeless. Not 
only is there no method of classification, but even if somebody were inspired and 
produced a sequence of nonhomeomorphic manifolds that exhausted all the 
possibilities, this result would be useless because we have no effective (i.e.,
universal, and requiring only a finite number of steps) way of checking which 
element of the sequence of representative manifolds a given manifold is homeo- 
morphic to. In other words, even if we had a classificatory sequence of manifolds, 
the classification of a concrete manifold would require the carrying out of infinitely 
many comparison operations between this manifold and the successive elements of 
the sequence. We note that even if it turned out that 3-manifolds (this is a 
standard abbreviation) are likewise not classifiable, this impossibility would be 
different from that for manifolds of dimension four and up. This is so because 
3-manifolds behave differently than their higher-dimensional “relatives.” 
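
The notion of a code, and its nonuniqueness, can be made concrete in the simplest case. In the sketch below a code is just the list of top-dimensional simplexes, each given by its vertex labels; this encoding and the function names are our own choices. The surface of a tetrahedron and the surface of an octahedron are two different codes for the same manifold, the sphere. The Euler characteristic computed from either code agrees, but agreement of such an invariant is only a necessary condition for homeomorphism, and Markov’s theorem says that in dimension 4 and higher no algorithm can settle the question in general.

```python
from itertools import combinations, product

def all_faces(facets):
    """Every nonempty face (as a frozenset of vertices) of the given top simplexes."""
    faces = set()
    for facet in facets:
        for size in range(1, len(facet) + 1):
            faces.update(frozenset(c) for c in combinations(facet, size))
    return faces

def euler_characteristic(facets):
    """Alternating count of faces by dimension (a face with k vertices has dimension k - 1)."""
    return sum((-1) ** (len(face) - 1) for face in all_faces(facets))

# Two codes for the 2-sphere.
tetrahedron = list(combinations(range(4), 3))
octahedron = list(product(("+x", "-x"), ("+y", "-y"), ("+z", "-z")))

for name, code in (("tetrahedron", tetrahedron), ("octahedron", octahedron)):
    print(name, len(code), "triangles,", len(all_faces(code)), "faces in all,",
          "Euler characteristic", euler_characteristic(code))
```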


Figure 3. A disk and a sphere are simply connected, for their fundamental groups are trivial (a closed 
loop on either of them can be shrunk to a point). A ring and a torus are not simply connected, but their 
fundamental groups are different. The fundamental group of a ring is isomorphic to Z and the 
fundamental group of the torus is isomorphic to Z ⊕ Z. In the case of the ring the fundamental group is
generated by closed paths that loop the hole and by their integral multiples. In the case of the torus we 
have meridians and parallels and their multiples. 


Wherein lies the difference? A detailed answer to this question would require 
the presentation of a number of facts from algebraic topology. This we cannot do 
here. What we can do is describe a certain deep difference between 3-manifolds
and 4-manifolds. The difference is that given a finitely presented group it is
possible to construct a 4-manifold whose fundamental group is the given group,
whereas for 3-manifolds this construction is in general impossible. More specifically,
it has been shown that there are groups that cannot be the fundamental
group of any 3-manifold [9]. One such is the direct sum Z ⊕ Z ⊕ Z ⊕ Z of four
copies of Z.

We see that topological manifolds of lowest dimension behave in a “singular”
manner, in the sense that they can be classified. In this respect the dimension 4 
plays the role of a limiting case. 

It might seem that classification is a manifestation of excessive pedantry on the 
part of topologists whose love of order compels them to try to systematize the 
abundance of manifolds. But recently this classification has become of intense 
interest to physicists. The reason is that the topology of manifolds determines the 
possible interactions of the elementary components of matter. We give some 
relevant facts. 

Until recently it was generally assumed that the elementary particles are 
pointlike. If so, then their history in spacetime is represented by worldlines. In 
classical accounts the interactions of particles are represented by intersecting or 
ramified worldlines. In quantum accounts they are represented graphically in a 
similar way by means of Feynman diagrams. A single line is a manifold, but an 
object in the form of an X or a Y is not, for a branch point, or a point of 
intersection, constitutes a singularity. This means that 1-dimensional topology, and 
in particular 1-dimensional manifolds, are useless for the description of interac- 
tions. The situation changes radically if the elementary objects are stringlike. 
Closed strings or loops, regarded as more interesting than open strings (segments), 
determine in spacetime worldtubes shaped like the surface of an asymmetric 
cylinder whose axis is a timelike line. We illustrate an interaction in which two 
strings collide and become one by means of a diagram that resembles a pair of 
trousers: two cylinders merging into one. The common feature of these and of 
more complex interactions is that the worldtubes of free and interacting strings are 
“proper” 2-manifolds without singular points. If a worldtube has the topology of a 
cylinder then the string is free. Every interaction results in a change in the 
topology. Topologically inequivalent worldtubes correspond to different interac- 
tions. This means that the topological classification of manifolds characterizes the 
possible Feynman diagrams of interacting strings (see [34]).

This reasoning carries over to higher-dimensional elementary objects. 2-dimen- 
sional membranes describe in spacetime 3-dimensional “worldsolids,” and the 
difference between free and interacting membranes is reflected in the different 
topologies of their “solids.” It is generally thought that one should consider strings 
and ignore membranes and higher-dimensional objects. The arguments in favor of 
such a position are purely technical and are based on certain formal properties of 
mathematical strings; their physical sense is unclear. We recall that it is impossible 
to classify 4-manifolds, and that the problem of classification of 3-manifolds offers 
difficulties that have not been surmounted thus far. This implies the impossibility 
of classification of Feynman diagrams corresponding to dimension 4 and our 
present inability to classify the Feynman diagrams corresponding to dimension 3. 
Is it possible that these facts offer the first serious argument in favor of strings and 
against membranes and other objects? The thought that there may be so intimate a 
connection between topology and physics is truly fascinating.


PROBLEMS AND SOLUTIONS


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 


with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, Richard Pfiefer, Leonard Smiley, John Henry Steelman, 
Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY
problems address on the inside front cover. Submitted problems should include
solutions and relevant references. Submitted solutions should arrive at that address
before April 30, 1998. Additional information, such as generalizations and references,
is welcome. The problem number and the solver’s name and address should
appear on each solution. An acknowledgement will be sent only if a mailing label
is provided. An asterisk (*) after the number of a problem or a part of a problem
indicates that no solution is currently available.


PROBLEMS 


10620. Proposed by James Propp, Massachusetts Institute of Technology, Cambridge, MA.
A digraph on a vertex set $V$ is a subset $A \subseteq \{(v, w) : v, w \in V,\ v \ne w\}$ and is called
strongly connected if it is possible to get from any vertex $a$ to every other vertex $e$ by a
finite succession of arcs $(a, b), (b, c), \ldots, (d, e)$ in $A$. For $n \ge 1$, let $E_n$ (respectively $O_n$)
denote the number of strongly connected digraphs on the vertex set $V = \{1, 2, \ldots, n\}$ with
an even (respectively odd) number of arcs. Show that $E_n - O_n = (n - 1)!$ for all $n \ge 1$.
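
The identity in Problem 10620 can be checked by exhaustive enumeration for very small $n$. The following Python sketch is only an illustration, not a proof; the function names are hypothetical and vertices are labelled $0, \ldots, n-1$ rather than $1, \ldots, n$.

```python
from itertools import combinations
from math import factorial


def is_strongly_connected(n, arcs):
    """True if the digraph on vertices 0..n-1 with these arcs is strongly connected."""
    forward = {v: [] for v in range(n)}
    backward = {v: [] for v in range(n)}
    for v, w in arcs:
        forward[v].append(w)
        backward[w].append(v)

    def reachable(graph):
        # Depth-first search from vertex 0.
        seen = {0}
        stack = [0]
        while stack:
            v = stack.pop()
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    # Strongly connected: vertex 0 reaches everything and is reached by everything.
    return len(reachable(forward)) == n and len(reachable(backward)) == n


def even_minus_odd(n):
    """E_n - O_n, computed by enumerating every arc set on n vertices."""
    all_arcs = [(v, w) for v in range(n) for w in range(n) if v != w]
    diff = 0
    for k in range(len(all_arcs) + 1):
        sign = 1 if k % 2 == 0 else -1
        for arcs in combinations(all_arcs, k):
            if is_strongly_connected(n, arcs):
                diff += sign
    return diff
```

Comparing `even_minus_odd(n)` with `factorial(n - 1)` is feasible for $n \le 4$; the number of arc sets, $2^{n(n-1)}$, grows too quickly for this approach to go much further.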


10621. Proposed by Harold G. Diamond and Bruce Reznick, University of Illinois, Urbana-Champaign, IL.
Let $F(x)$ denote Cantor's singular function; that is, the unique non-decreasing function on $[0,1]$ such that, if $x = \sum_{j=1}^{\infty} 2\epsilon_j/3^j$ with $\epsilon_j \in \{0,1\}$, then $F(x) = \sum_{j=1}^{\infty} \epsilon_j/2^j$. It is clear by symmetry that $\int_0^1 F(x)\,dx = 1/2$. Prove that

$$\int_0^1 (F(x))^2\,dx = \frac{3}{10} \quad\text{and}\quad \int_0^1 (F(x))^3\,dx = \frac{1}{5}.$$

More generally, evaluate $\int_0^1 (F(x))^n\,dx$ for every positive integer $n$.
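
One way to sanity-check the stated values is to use the self-similarity of $F$: $F(x/3) = F(x)/2$, $F = 1/2$ on $[1/3, 2/3]$, and $F(2/3 + x/3) = 1/2 + F(x)/2$. Writing $I_n = \int_0^1 F^n$ and splitting the integral over the three thirds gives, under these assumptions, $(3 \cdot 2^n - 2) I_n = 1 + \sum_{k<n} \binom{n}{k} I_k$ with $I_0 = 1$. The sketch below (function name hypothetical) evaluates this recursion in exact arithmetic; it is an illustration and not the solution intended by the proposers.

```python
from fractions import Fraction
from math import comb


def cantor_moments(n_max):
    """Return [I_0, ..., I_n_max], where I_n is the integral of F(x)**n over [0, 1]."""
    moments = [Fraction(1)]  # I_0 = 1
    for n in range(1, n_max + 1):
        # Contribution of the lower-order moments from the right-hand third.
        lower = sum(comb(n, k) * moments[k] for k in range(n))
        moments.append((1 + lower) / (3 * 2**n - 2))
    return moments
```

For example, `cantor_moments(3)` reproduces the values $1/2$, $3/10$, and $1/5$ quoted in the problem.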


10622. Proposed by M. N. Deshpande, Nagpur, India. Find infinitely many triples $(a, b, c)$
of positive integers such that $a, b, c$ are in arithmetic progression and such that $ab + 1$,
$bc + 1$, and $ca + 1$ are perfect squares.
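
A small search shows that such triples exist, for instance $(1, 8, 15)$, where $ab + 1 = 9$, $bc + 1 = 121$, and $ca + 1 = 16$. The Python sketch below (hypothetical function names, increasing progressions only) merely lists examples in a bounded range; it says nothing about infinitude.

```python
from math import isqrt


def is_square(m):
    """True if the non-negative integer m is a perfect square."""
    r = isqrt(m)
    return r * r == m


def ap_square_triples(limit):
    """Triples (a, a+d, a+2d) with 1 <= a, d <= limit meeting the three square conditions."""
    found = []
    for a in range(1, limit + 1):
        for d in range(1, limit + 1):
            b, c = a + d, a + 2 * d
            if is_square(a * b + 1) and is_square(b * c + 1) and is_square(c * a + 1):
                found.append((a, b, c))
    return found
```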


10623. Proposed by Roy Barbara, Lebanese University, Fanar, Lebanon. Let $P = \{1, 2, 3, \ldots\}$ and let $|$ be the usual divisibility relation on $P$. For any $S \subseteq P$ and $n \in P$, let
$S + n = \{s + n : s \in S\}$.

(a) Can one construct a subset $S$ of $P$ such that the poset $(S, |)$ is isomorphic to $(P, |)$,
$(S + 1, |)$ is isomorphic to $(P, <)$, and $(S + 2, |)$ is isomorphic to $(P, |)$?

(b) For which integers $n > 1$ can one find a subset $T$ of $P$ such that $(T, |)$, $(T + n, |)$, and
$(P, |)$ are isomorphic posets?

10624. Proposed by William F. Trench, Trinity University, San Antonio, TX. Suppose that
$a_0 \ge a_1 \ge a_2 \ge \cdots$ and $\lim_{n\to\infty} a_n = 0$. Define

$$S_n = \sum_{j=n}^{\infty} (-1)^{j-n} a_j = a_n - a_{n+1} + a_{n+2} - \cdots.$$

Show that $\sum a_n S_n < \infty$ if and only if $\sum a_n^2 < \infty$.


10625. Proposed by Olaf Krafft and Martin Schaefer, Technical University Aachen, Aachen,
Germany. For $x > 0$ and $n \in \mathbb{N}$, define


m= O(a) 1D (7)


Evaluate $\lim_{n\to\infty} a_n$.


10626. Proposed by Florian Luca, Syracuse University, Syracuse, NY. For a positive integer
$k$, the number of positive integers less than $k$ that are relatively prime to $k$ is denoted $\phi(k)$.
(a) Show that if $m$ and $n$ are relatively prime positive integers, then $\phi(5^m - 1) \ne 5^n - 1$.
(b)* Find all positive integers $m, n$ such that $\phi(5^m - 1) = 5^n - 1$.


SOLUTIONS 


Comparison Almost Everywhere 


10372 [1994, 274]. Proposed by Paul R. Chernoff and Jacob Feldman, University of California, Berkeley, CA. Let $(f_n)_{n=1}^{\infty}$ be a sequence of non-negative integrable functions on the
unit interval $[0, 1]$. Write $\int_0^1 f_n(x)\,dx = c_n$ and suppose that $\sum c_n < \infty$.

(a) Suppose also that $\sum \sqrt{c_n} < \infty$. Show that there is a convergent series of non-negative
terms $a_n$ such that, for almost all $x \in [0, 1]$, $f_n(x) \le a_n$ for all sufficiently large $n$.

(b) Show that the conclusion of (a) may fail if $\sum \sqrt{c_n} = \infty$.


Solution of (a) by the proposers. Set $a_n = \sqrt{c_n}$. The set $E_n = \{x : f_n(x) > a_n\}$ has
Lebesgue measure $m(E_n) \le a_n$. Also, since $\sum a_n < \infty$, we have $\lim_{N\to\infty} \sum_{n \ge N} a_n = 0$.
Accordingly,

$$\lim_{N\to\infty} m\Bigl(\bigcup_{n \ge N} E_n\Bigr) = 0.$$

Hence $m(E) = 0$, where $E = \bigcap_{N \ge 1} \bigcup_{n \ge N} E_n$. If $x \notin E$ then there is some integer
$N = N(x)$ such that $x \notin \bigcup_{n \ge N} E_n$; this means that $x \notin E_n$ for all $n \ge N$, so $f_n(x) \le a_n$.


Solution of (b) by Robert B. Israel, University of British Columbia, Vancouver, B. C., Canada.
Suppose $\sum_n c_n < \infty$ but $\sum_n \sqrt{c_n} = \infty$. We construct a sequence of nonnegative integrable
functions $f_n$ on $[0, 1]$ with $\int_0^1 f_n(x)\,dx = c_n$ for which there are no terms $a_n$ that satisfy the
requirements.

Let $s_n = \sum_{j=1}^{n} \sqrt{c_j}$. Let $\pi: \mathbb{R} \to [0, 1)$ be defined by $\pi(x) = x \bmod 1$, and let
$A_n = \pi([s_{n-1}, s_n])$. Then $m(A_n) = \sqrt{c_n}$ if $c_n < 1$ (which is true for all but finitely many
$n$), and every point of $[0, 1)$ is in infinitely many $A_n$. Take $f_n(x) = \sqrt{c_n}$ for $x \in A_n$ and
$f_n(x) = 0$ otherwise. Suppose $(a_n)$ is a nonnegative sequence with $\sum a_n < \infty$. Let $S =
\{n : a_n < \sqrt{c_n}\}$ and $T = \{n : a_n \ge \sqrt{c_n}\}$. We have $\sum_{n \in T} m(A_n) \le \sum_{n \in T} a_n < \infty$.
Therefore, almost every $x$ in $[0, 1)$ is in only finitely many $A_n$ with $n \in T$. This means that
for almost every $x \in [0, 1)$ there are infinitely many $n \in S$ for which $x \in A_n$, i.e., it is not
true that $f_n(x) \le a_n$ for all sufficiently large $n$.

Editorial comment. It should have been said more clearly that the desired property is: for
almost all $x \in [0, 1]$ there is a positive integer $N$ depending on $x$ such that $f_n(x) \le a_n$ for
all $n \ge N$.

Kent Merryfield solved the more general problem: Suppose $p, q \in [1, \infty)$ and $s = pq/(p + q)$. Let $(f_n)$ be a sequence of nonnegative functions with $\bigl(\int_0^1 f_n^p\bigr)^{1/p} = c_n$.
If $\sum c_n^s < \infty$, then there is a sequence $(a_n)$ with $\sum a_n^q < \infty$ such that for almost all $x$,
$f_n(x) \le a_n$ for all sufficiently large $n$. Furthermore, this may fail whenever $\sum c_n^s = \infty$.
The original problem corresponds to $p = q = 1$ and $s = 1/2$.


Solved also by F. Galvin & P. J. Szeptycki, R. Holzsager, T. Jager, J. H. Lindsey II, O. P. Lossers (The Netherlands), 
L. E. Mattics, K. G. Merryfield, K. Schilling, R. B. Tucker, and J. Vinson. 


Another Perspective on Napoleon’s Theorem 


10415 [1994, 912]. Proposed by Edward Kitchen, Santa Monica, CA. Let $\Delta$ be a triangle
whose centroid is at the origin. Choose $k \in \mathbb{R}$, $k > 1$, and dilate one of the Napoleon
triangles of $\Delta$ by a factor of $-k$ and the other by a factor of $k/(1 - k)$. Prove that $\Delta$ is
(simultaneously) perspective with both dilated triangles.


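Solution by J. C. Binz, University of Bern, Bern, Switzerland. Let the triangle be given in
the complex plane by the numbers $2a_1, 2a_2, 2a_3$ (counterclockwise) with $a_1 + a_2 + a_3 = 0$,
and let $d = i\sqrt{3}/6$. Then the vertices of the dilated Napoleon triangles are

$$b_i = -k\bigl(a_{i+1} + a_{i+2} + d(a_{i+1} - a_{i+2})\bigr) = k\bigl(a_i - d(a_{i+1} - a_{i+2})\bigr)$$

and

$$c_i = \bigl(-k/(1 - k)\bigr)\bigl(a_i + d(a_{i+1} - a_{i+2})\bigr),$$

where $a_4 = a_1$ and $a_5 = a_2$. Now, $b_i - 2a_i = (k - 2)a_i - kd(a_{i+1} - a_{i+2}) = e_i$ and
$c_i - 2a_i = \bigl(1/(1 - k)\bigr)e_i$. Since $(b_i - 2a_i)/(c_i - 2a_i) = (1 - k) \in \mathbb{R}$, the triples $2a_i, b_i, c_i$
are collinear. Let $G_i$ denote the corresponding lines. The complex equations of the lines
$G_i$ are

$$\operatorname{Im}\bigl((\bar z - 2\bar a_i)e_i\bigr) = \operatorname{Im}\bigl(\bar z e_i + 2kd\,\bar a_i(a_{i+1} - a_{i+2})\bigr) = 0,$$

where $\operatorname{Im}(z)$ stands for the imaginary part of $z$. The common point $p$ of $G_1$ and $G_2$ lies on the
line $H$ with the equation $\operatorname{Im}\bigl((\bar z - 2\bar a_1)e_1\bigr) + \operatorname{Im}\bigl((\bar z - 2\bar a_2)e_2\bigr) = 0$. Since $e_1 + e_2 + e_3 = 0$,
this equation can be written

$$\operatorname{Im}\bigl(-\bar z e_3 + 2kd(\bar a_1 a_2 - a_1 \bar a_2 + a_3 \bar a_2 - a_3 \bar a_1)\bigr) = 0.$$

Since $\operatorname{Im}\bigl(d(\bar a_1 a_2 - a_1 \bar a_2)\bigr) = 0$ and $\operatorname{Im}\bigl(d\,a_3(\bar a_2 - \bar a_1)\bigr) = -\operatorname{Im}\bigl(d\,\bar a_3(a_1 - a_2)\bigr)$, we obtain
finally the equation

$$\operatorname{Im}\bigl(\bar z e_3 + 2kd\,\bar a_3(a_1 - a_2)\bigr) = 0$$

for $H$. Therefore $H$ is $G_3$ and the three lines $G_1$, $G_2$, and $G_3$ are concurrent.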
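
The collinearity and concurrency claims in this solution can be checked numerically. The Python sketch below follows the solution's formulas for $b_i$, $c_i$, and $e_i$ with $d = i\sqrt{3}/6$; the function names and the sample inputs are hypothetical, and the check is a floating-point illustration rather than a proof.

```python
from math import sqrt


def cross(u, v):
    """Planar cross product of two complex numbers viewed as vectors: Im(conj(u) * v)."""
    return (u.conjugate() * v).imag


def perspective_residuals(a1, a2, k):
    """Return (collinearity residuals, concurrency residual) for the triangle 2a_1, 2a_2, 2a_3."""
    a = [a1, a2, -(a1 + a2)]  # a_1 + a_2 + a_3 = 0, so the centroid is the origin
    d = 1j * sqrt(3) / 6
    vertices, directions, collinear = [], [], []
    for i in range(3):
        diff = a[(i + 1) % 3] - a[(i + 2) % 3]
        b = k * (a[i] - d * diff)
        c = (-k / (1 - k)) * (a[i] + d * diff)
        vertices.append(2 * a[i])
        directions.append(b - 2 * a[i])  # the direction e_i of the line G_i
        collinear.append(cross(b - 2 * a[i], c - 2 * a[i]))
    # Intersect G_1 and G_2, then measure how far the intersection is from G_3.
    t = cross(vertices[1] - vertices[0], directions[1]) / cross(directions[0], directions[1])
    p = vertices[0] + t * directions[0]
    concurrency = cross(p - vertices[2], directions[2])
    return collinear, concurrency
```

For a generic choice such as `perspective_residuals(1 + 0.5j, -0.3 + 1.2j, 3.0)`, all returned residuals should vanish up to rounding error.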


Editorial comment. The solver also noted that the proof could be generalized to allow the 
Napoleon triangles to be replaced by triangles formed by the vertices of any set of similar 
isosceles triangles built on the sides of the triangle. This construction leads to the Kiepert 
hyperbola (see R. H. Eddy and R. Fritsch, The conics of Ludwig Kiepert: A comprehensive 
lesson in the geometry of the triangle, Math. Mag. 67 (1994) 188-205). If the hyperbola is 
parameterized by the base angle $\phi$ of the isosceles triangle, then dilation by a factor $\lambda$ leads
to the point on the hyperbola whose parameter $\theta$ satisfies $\tan\theta/\tan\phi = 3\lambda/(2 + \lambda)$.


Solved also by J. Anglesio (France), M. Benedicty, J. Ferrer (Spain), J. Fukuta (Japan), C. G. Petalas & T. P. Vidalis 
(Greece), R. Tauraso (Italy), NSA Problems Group, and the proposer. 
A Paucity of Diophantine Solutions


10445 [1995, 359]. Proposed by Alan J. Gross, Medical University of South Carolina,
Charleston, SC, and Hong Zhang, Indiana-Purdue University, Fort Wayne, IN. Note that
$5^2 + 5 + 2 = 2^5$. Are there any other positive integers $a$ and $b$ with $a^b + a + b = b^a$?


Solution by the late J. Sutherland Frame, Michigan State University, East Lansing, MI. No.
Setting $b = 1$ forces $a = 0$, which is not positive. Setting $b = 2$ yields only $a = 5$, since
$2^a > (a+1)^2$ for $a \ge 6$. Hence we may assume that $b \ge 3$. It is also easy to show that there
are no solutions with $a = 1$ or $a = 2$. The inequality $a^b < b^a$ implies that $a^{1/a} < b^{1/b}$.
Since $x^{1/x}$ decreases with increasing $x$ when $x > e$, we obtain $a > b \ge 3$.

Let $f(x) = x^b + x + b - b^x$. We have $f(b) = 2b > 0$. We claim that $f(b + 1) < 0$
and that $f$ is decreasing for $x \ge b \ge 3$, so its only root above $b$ lies between $b$ and $b + 1$
and cannot be an integer. To prove that $f(b + 1) < 0$, observe that for $b \ge 3$,

$$b^{b+1} = e\,b^b + (b - e)b^b > (1 + 1/b)^b b^b + 0.28\,b^b > (b + 1)^b + (2b + 1).$$

For $x \ge b \ge 3$, we have observed that $x^b \le b^x$. Thus

$$f'(x) = b x^{b-1} + 1 - (\ln b)\,b^x = b^x\left(\frac{b}{x}\cdot\frac{x^b}{b^x} + \frac{1}{b^x} - \ln b\right) \le b^x\left(1 + \frac{1}{27} - \ln 3\right) < 0.$$
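
The conclusion can be illustrated by a direct search over a bounded range, which is easy in Python because integers have arbitrary precision. The function name is hypothetical, and the search only confirms the result up to the chosen limit.

```python
def diophantine_solutions(limit):
    """All pairs (a, b) with 1 <= a, b <= limit satisfying a**b + a + b == b**a."""
    return [
        (a, b)
        for a in range(1, limit + 1)
        for b in range(1, limit + 1)
        if a**b + a + b == b**a
    ]
```

Within such a range, the only pair returned should be $(a, b) = (5, 2)$, in agreement with the solution.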


Solved also by J. Anglesio (France), R. Barbara (Lebanon), D. Beckwith, M. R. Burke & L. Sweet (Canada), R. J. 
Chapman (U. K.), J. Christopher, R. B. Eggleton, K. Foltz & N. Rosenberg, Z. Franco, S. M. Gagola Jr., R. A. Groeneveld,
G. A. Heuer, J. Kholdi, N. Komanda, W. C. Lang, J. H. Lindsey II, J. H. van Lint (The Netherlands), L. E. Mattics, C. A. 
Minh, E. D. Onstott, A. Pedersen (Denmark), R. E. Prather, R. Stong, A. A. Tarabay (Lebanon), A. N. ’t Woord (The 
Netherlands), W. C. Wu (China), C. Y. Yildirim (Turkey), Anchorage Math Solutions Group, NSA Problems Group, 
Oklahoma State Problems Group, and the proposers. 


Conjugate Generators 


10449 [1995, 360]. Proposed by Frank Schmidt, Arlington, VA. For which $n$ can the
symmetric group $S_n$ be generated by two conjugate permutations?


Solution by Fred Galvin, University of Kansas, Lawrence, KS. This holds for all $n$. We
prove the stronger statement that if a group $G$ is generated by elements $a$ and $b$ such that
$a^2 = b^{2m+1} = 1$ for some integer $m$, then $G$ is generated by the two conjugate elements
$f = ab$ and $g = afa$. The proof is that $a = f(gf)^m$ and $b = (gf)^{m+1}$.

For $n > 1$ it is well known that $S_n$ is generated by the transposition $a = (1\,2)$ and the
cycle $c = (1\,2\,\ldots\,n)$. It is also generated by $a$ and the $(n - 1)$-cycle $ac$. Thus in applying
the result in the previous paragraph we take $b = c$ if $n$ is odd and $b = ac$ if $n$ is even.
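
For small $n$ the construction can be verified by generating the subgroup explicitly. The Python sketch below represents permutations of $\{0, \ldots, n-1\}$ as tuples; the helper names and the composition convention (apply the right-hand factor first) are assumptions of this illustration, and the identities $a = f(gf)^m$, $b = (gf)^{m+1}$ hold under either convention.

```python
from math import factorial


def compose(p, q):
    """The permutation 'apply q first, then p', as a tuple."""
    return tuple(p[q[i]] for i in range(len(q)))


def generated_group(gens):
    """Closure of the generators under multiplication (a subgroup, since it is finite)."""
    identity = tuple(range(len(gens[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        nxt = []
        for x in frontier:
            for g in gens:
                y = compose(g, x)
                if y not in seen:
                    seen.add(y)
                    nxt.append(y)
        frontier = nxt
    return seen


def conjugate_generators(n):
    """The pair f = ab, g = afa from the solution, for S_n with n >= 2."""
    a = tuple([1, 0] + list(range(2, n)))       # the transposition swapping 0 and 1
    c = tuple((i + 1) % n for i in range(n))    # the n-cycle
    b = c if n % 2 == 1 else compose(a, c)      # an element of odd order
    f = compose(a, b)
    g = compose(compose(a, f), a)               # a f a = a f a^{-1}, a conjugate of f
    return f, g
```

Checking that `len(generated_group(conjugate_generators(n)))` equals `factorial(n)` is quick for $n \le 6$.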


Editorial comment. The work of most solvers led to a generator that was a cycle of maximal 
even length. For n > 5, John H. Lindsey II gave an example that was a product of a 
transposition and a maximal disjoint cycle of odd length. Stephen M. Gagola Jr. asked 
whether examples could be found if the cycle structure was subject to additional restrictions. 
Two necessary conditions are (1) there are an odd number of even cycles, and (2) there are 
fewer than n/2 fixed points. These conditions have been important in general work on 
generators of the symmetric group (see section 53 of Daniel Gorenstein (editor), Reviews 
on Finite Groups, AMS, 1974). In addition, Gagola mentioned (3) the element is not of
order 2 if $n > 3$, which is necessary because two elements of order 2 generate a dihedral
group.


Solved also by M. Brodie, R. J. Chapman (U. K.), S. M. Gagola Jr., J. H. Lindsey II, G. R. Robinson (England), R. Stong, 
D. B. Tyler, P. Venzke, A. N. ’t Woord (The Netherlands), Anchorage Math Solutions Group, NSA Problems Group, 
Oklahoma State Problems Group, WMC Problems group, and the proposer. 
A Surprising Family of Continued Fractions


10457 [1995, 464]. Proposed by Henry Cohn, Massachusetts Institute of Technology, Cambridge, MA. Determine the simple continued fraction of $(F_{10n+1}/F_{10n})^5$, where $F_k$ denotes
the $k$th Fibonacci number.


Solution by B. M. M. de Weger, Krimpen aan den IJssel, The Netherlands. The answer is

$$\left(\frac{F_{10n+1}}{F_{10n}}\right)^5 = \left[11, \ldots, 11, 10, 1, 1, \frac{F_{20n+1} - 6}{5}, 1, 17, 11, \ldots, 11, 10, 1, 4, 11, \ldots, 11\right],$$

where the strings of 11's have lengths $2n - 1$, $2n - 2$, and $2n - 1$, respectively. Note that
indeed $(F_{20n+1} - 6)/5 \in \mathbb{Z}$. We prove this by translating continued fraction expansions
into matrix products and using the following lemma.

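Lemma. If $p/q > 1$ is a rational number in lowest terms, and $a_0, \ldots, a_n$ are positive
integers with $a_n \ge 2$, then $p/q = [a_0, \ldots, a_n]$ if and only if integers $c, d$ exist such that

$$\begin{pmatrix} p & c \\ q & d \end{pmatrix} = \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix}.$$

Proof. For $0 \le m \le n$, let $p_m/q_m$ in lowest terms be defined by $p_m/q_m = [a_m, \ldots, a_n]$.
Also let

$$\begin{pmatrix} r_m & c_m \\ s_m & d_m \end{pmatrix} = \begin{pmatrix} a_m & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_{m+1} & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix}.$$

Then $p_n = r_n = a_n$ and $q_n = s_n = 1$. By

$$[a_m, \ldots, a_n] = a_m + \frac{1}{[a_{m+1}, \ldots, a_n]} \quad\text{and}\quad \begin{pmatrix} r_m & c_m \\ s_m & d_m \end{pmatrix} = \begin{pmatrix} a_m & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} r_{m+1} & c_{m+1} \\ s_{m+1} & d_{m+1} \end{pmatrix},$$

we have the same recurrence relations for $(p_m, q_m)$ as for $(r_m, s_m)$. $\square$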
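In view of the lemma, it suffices to show for some integers $c, d$ that

$$\begin{pmatrix} F_{10n+1}^5 & c \\ F_{10n}^5 & d \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ 1 & 0 \end{pmatrix}^{2n-1} \begin{pmatrix} 10 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{2} \begin{pmatrix} \frac{F_{20n+1}-6}{5} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$$
$$\times \begin{pmatrix} 17 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 11 & 1 \\ 1 & 0 \end{pmatrix}^{2n-2} \begin{pmatrix} 10 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 4 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 11 & 1 \\ 1 & 0 \end{pmatrix}^{2n-1}.$$

In expanding the matrix product, we use the identities $F_{20n+1} = F_{10n}^2 + F_{10n+1}^2$, $F_{10n-5} =
-8F_{10n} + 5F_{10n+1}$, $F_{10n-10} = 89F_{10n} - 55F_{10n+1}$, $F_{10n-15} = -987F_{10n} + 610F_{10n+1}$,
and

$$\begin{pmatrix} 11 & 1 \\ 1 & 0 \end{pmatrix}^{k} = \frac{1}{5} \begin{pmatrix} F_{5k+5} & F_{5k} \\ F_{5k} & F_{5k-5} \end{pmatrix}.$$

The needed matrix identity now reduces to two identities involving $F_{10n}$ and $F_{10n+1}$:

$$5F_{10n+1}^5 = F_{10n}^5 + 2F_{10n}^4 F_{10n+1} + 2F_{10n}^3 F_{10n+1}^2 + 4F_{10n}^2 F_{10n+1}^3 + F_{10n} F_{10n+1}^4 + 2F_{10n+1}^5$$
$$\qquad + F_{10n}^3 + F_{10n}^2 F_{10n+1} + 2F_{10n} F_{10n+1}^2 + 3F_{10n+1}^3,$$

$$5F_{10n}^5 = 2F_{10n}^5 - F_{10n}^4 F_{10n+1} + 4F_{10n}^3 F_{10n+1}^2 - 2F_{10n}^2 F_{10n+1}^3 + 2F_{10n} F_{10n+1}^4 - F_{10n+1}^5$$
$$\qquad - 3F_{10n}^3 + 2F_{10n}^2 F_{10n+1} - F_{10n} F_{10n+1}^2 + F_{10n+1}^3.$$

These can be proved by inserting $F_k = \frac{1}{\sqrt5}\left(\left(\frac{1+\sqrt5}{2}\right)^k - \left(\frac{1-\sqrt5}{2}\right)^k\right)$ and expanding the
products.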
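
The stated expansion can be compared with a direct computation for small $n$. The Python sketch below (hypothetical function names) runs the Euclidean algorithm on $F_{10n+1}^5$ and $F_{10n}^5$ and builds the claimed list of partial quotients; it is a numerical check, not a substitute for the matrix argument.

```python
def fib(k):
    """The k-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    x, y = 0, 1
    for _ in range(k):
        x, y = y, x + y
    return x


def continued_fraction(num, den):
    """Partial quotients of num/den, obtained from the Euclidean algorithm."""
    quotients = []
    while den:
        q, r = divmod(num, den)
        quotients.append(q)
        num, den = den, r
    return quotients


def claimed_expansion(n):
    """The expansion stated in the solution, for n >= 1."""
    middle = (fib(20 * n + 1) - 6) // 5
    return (
        [11] * (2 * n - 1) + [10, 1, 1, middle, 1, 17]
        + [11] * (2 * n - 2) + [10, 1, 4]
        + [11] * (2 * n - 1)
    )


def actual_expansion(n):
    """The simple continued fraction of (F_{10n+1} / F_{10n})**5."""
    return continued_fraction(fib(10 * n + 1) ** 5, fib(10 * n) ** 5)
```

For $n = 1$ both functions give $[11, 10, 1, 1, 2188, 1, 17, 10, 1, 4, 11]$, where $2188 = (F_{21} - 6)/5$.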


Solved also by J. Anglesio (France), R. J. Chapman (U. K.), and the proposer. 
Solitaire on the Circle


10459 [1995, 553]. Proposed by David Beckwith, Sag Harbor, NY. A game is played with 
n disks (n > 3), each having a black face and a red face. Initially, the n disks are arranged 
in a circle showing a random pattern of black and red faces. A move consists of taking away 
a black disk (i.e., one with its black face exposed) and inverting its neighbors (if any). The 
resulting gap is not closed up, so the remaining disks do not acquire new neighbors. The 
goal is to remove all the disks. For which initial patterns is this possible? 


Solution I by Glenn G. Chappell, Southeast Missouri State University, Cape Girardeau, 
MO, and Chris Hartman, University of Alaska, Fairbanks, AK. It is possible if and only 
if the initial pattern has a nonzero even number of black disks. Note that the first move
(including the flips) changes the parity of the number of black disks. Hence it suffices to 
prove that contiguous disks with a gap at each end can all be removed if and only if the line 
has an odd number of black disks or is empty. 

The statement is trivial for the empty string; we proceed by induction on n. Suppose 
first that the number of black disks is odd. We may choose a black disk D having an even 
number (possibly zero) of black disks to each side. Removing D creates two (possibly 
empty) shorter lines of disks. If either side is nonempty, completing the move by flipping 
the neighbor of D creates a line on that side with an odd number of black disks. By the 
induction hypothesis, the remaining disks on both sides can be removed. 

Now suppose that the nonempty line has an even number of black disks. If it has no black 
disks, then nothing can be removed. Otherwise, each black disk D has an odd number of 
black disks to one of its sides. Removing D and flipping its neighbor on that side creates a 
smaller line with an even number of black disks which, by the induction hypothesis, cannot 
be removed. Thus there is no first disk to remove that permits success. 


Solution II by O. P. Lossers, University of Technology, Eindhoven, The Netherlands. To
prove the same result, we label disks with the integers modulo $n$. When there is a successful
removal procedure, we define a bijection $\pi: \mathbb{Z}_n \to \mathbb{Z}_n$ by letting $\pi(i) = j$ when disk $i$ is
removed at stage $j$. A disk initially red has been inverted once when it is removed; a disk
initially black has been inverted zero or two times. Thus an original black disk is removed
before both neighbors or after both neighbors, while an original red disk is removed between
its neighbors. Thus the periodic extension of $\pi$ ($\pi(n) = \pi(0)$, etc.) has a local extremum
at $i$ precisely when $i$ is initially black. These local extrema are alternately maxima and
minima, so there are an even number of them.

For sufficiency, we construct such a bijection when the number of black disks is nonzero 
and even. We begin by giving the black disks distinct real labels that are alternately positive 
and negative as we traverse the circle. Between a pair of black disks, we assign labels to the 
red disks so that all labels are distinct and the labels are monotone on this interval. Finally, 
we assign j to the disk that has received the jth smallest number. 
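
The characterization can be confirmed by exhaustive search for small circles. In the Python sketch below, a position is a tuple over `'B'` (black), `'R'` (red), and `'.'` (gap); this encoding and the function name are assumptions of the illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def solvable(state):
    """True if every disk in the circular position `state` can be removed (len(state) >= 3)."""
    if all(s == '.' for s in state):
        return True
    n = len(state)
    for i, s in enumerate(state):
        if s == 'B':
            new = list(state)
            new[i] = '.'  # remove the black disk; the gap is not closed up
            for j in ((i - 1) % n, (i + 1) % n):
                # invert each neighbor that is still present
                if new[j] == 'B':
                    new[j] = 'R'
                elif new[j] == 'R':
                    new[j] = 'B'
            if solvable(tuple(new)):
                return True
    return False
```

For each initial pattern with $3 \le n \le 7$, `solvable` should return `True` exactly when the number of black disks is nonzero and even.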


Editorial comment. Randall Maddox of Pepperdine University used Multimedia Toolbook 
to create a version of the game that runs in Microsoft Windows 3.1. He suggested that
playing the game in this environment would be fun while leading to the discovery of the 
solution. He has offered to make this application available to anyone who wants it. 


Solved also by S. F. Barger, M. Benedicty, W. C. Calhoun, D. Callan, M. Cerne (Slovenia), R. J. Chapman (U. K.),
Y. S. Chen, S. M. Gagola Jr., W. Gasarch & P. Godfrey, R. Holzsager, R. D. Hurwitz & S. Hurwitz, L. Keener & 
S. Walters (Canada), P. G. Kirmser, C. Lindsey, J. H. Lindsey II, J. H. van Lint (The Netherlands), H. Loimer (Austria), 
R. B. Maddox, R. Martin (Germany), M. D. Meyerson, B. Mixon, J. H. Nieto (Venezuela), A. Onshuus (Colombia), 
A. Pedersen (Denmark), K. Rebman, G. Rice, M. Rierson, A. J. Schwenk, K. Sonnichsen, R. Stong, J. R. Stoughton, 
S. Szabo (Hungary), A. A. Tarabay & B. B. Ghalayini (Lebanon), Y. Wang, M. Woltermann, A. N. ’t Woord (The 
Netherlands), Anchorage Math Solutions Group, Con Amore Problem Group (Denmark), NSA Problems Group, Prague 
Problem Solution Group (Czech Republic), Circulo de Matématicas — Universidad de Los Andes (Colombia), and 
the proposer. 
A Demanding Currency


10465 [1995, 554]. Proposed by Paul K. Stockmeyer, College of William and Mary, Williamsburg, VA. As the Minister of Finance of a newly independent country, you must design a
new currency: a sequence $d_1 < d_2 < d_3 < \cdots$ of positive integers with $d_1 = 1$, to be the
denominations of various coins and bills. Although you are authorized to create an infinite
number of denominations, the legislature has passed some laws restricting your choices.
Rule 1: There must be a bound $b$ on the number of items needed for any payment.
Rule 2: The "denomination density", $\lim_{k\to\infty} k/d_k$, must be zero.
Rule 3: Repeatedly choosing the largest denomination less than or equal to the amount 
remaining to be paid (the greedy algorithm) always leads to the use of the minimal number 
of items to pay any amount. 

Can you design a currency meeting these rules? 


Solution by Ellen Hertz, National Highway Traffic Safety Administration, Washington, DC.
There is no such currency. Rule 2 implies that $\{d_{k+1} - d_k\}$ is unbounded. By Rule 1, we
can select an amount $n$ that requires the maximum number, $b$, of items. Select an index $k$
such that $d_{k+1} - d_k > n$. Consider the amount $d_k + n$. By Rule 3, we can start with $d_k$ to
obtain the minimum number of items needed to pay $d_k + n$. However, doing so leaves the
amount $n$ remaining, which requires $b$ additional items. Thus $d_k + n$ requires $b + 1$ items, contradicting Rule 1.


Editorial comment. Several readers observed that any two of the rules can easily be satisfied.
The set of all integers satisfies rules 1 and 3, and a geometric progression satisfies rules 2 and
3. An example satisfying rule 1 with $b = 4$ and rule 2 is given by $d_k = k^2$. One can satisfy
rule 1 with $b = 2$ and rule 2 by having $d_k$ list in increasing order the set of numbers whose
ternary expansion omits the digit 2. Furthermore, while rule 3 with the usual interpretation
fails for this example, if one applies the greedy algorithm separately to each ternary digit of
the amount to be paid, one obtains a number in the set for which the balance is also in the
set.
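
To see concretely how Rule 3 can fail for $d_k = k^2$, one can compare the greedy payment with an optimal one. The Python sketch below uses hypothetical function names and assumes the denomination list contains 1; for the amount 12 the greedy algorithm pays $9 + 1 + 1 + 1$ (four items), whereas $4 + 4 + 4$ uses three.

```python
def greedy_count(amount, denominations):
    """Number of items used by the greedy algorithm (largest denomination first)."""
    count = 0
    for d in sorted(denominations, reverse=True):
        count += amount // d
        amount %= d
    return count


def minimal_count(amount, denominations):
    """Minimal number of items needed, by dynamic programming (requires 1 in denominations)."""
    best = [0] + [None] * amount
    for value in range(1, amount + 1):
        best[value] = 1 + min(best[value - d] for d in denominations if d <= value)
    return best[amount]
```

With `squares = [k * k for k in range(1, 11)]`, the call `greedy_count(12, squares)` gives 4 and `minimal_count(12, squares)` gives 3, while no amount up to 100 needs more than four squares, consistent with $b = 4$.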


Solved also by S. F. Barger, W. C. Calhoun, G. G. Chappell & C. M. Hartman, F. M. Djorup, P. Freyd, S. M. Gagola Jr., 
T. Hwa, L. Keener (Canada), J. H. Lindsey II, O. P. Lossers (The Netherlands), R. Martin (Germany), C. H. Montenegro 
(Colombia), G. Myerson (Australia), A. J. Schwenk, D.C. Terr, Y. Wang, D. R. Witte, G. J. Woeginger (The Netherlands), 
NSA Problems Group, Prague Problems Group (Czech Republic), and the proposer. 


The Number of Positive Semidefinite 0,1-Matrices 


10481 [1995, 840]. Proposed by Frank Schmidt, Arlington, VA. Let f (n) denote the number 
of positive semidefinite n by n matrices whose entries are 0 or 1. Let g(n) denote the number 
of positive definite n by n matrices whose entries are 0 or 1. Evaluate f(n) and g(n). 


Solution by David Callan, University of Wisconsin, Madison, WI. The answers are $g(n) = 1$
and $f(n) = B(n + 1)$, where $B(n)$ is the number of partitions of an $n$-element set (the $n$th
Bell number). The definitions permit only symmetric matrices. The determinant criterion
for a real symmetric matrix to be positive definite [positive semidefinite] is that its principal
submatrices all have positive [nonnegative] determinant. Thus a positive definite 0,1-matrix
has 1's on the diagonal (consider the 1 by 1 submatrices) and 0's off the diagonal (consider
the 2 by 2 submatrices). Hence $g(n) = 1$.

Every square matrix of 1's is positive semidefinite; its 1 by 1 principal submatrices have
determinant 1 and the others have determinant 0. Thus $A$ is positive semidefinite if there is
a permutation matrix $P$ such that $P^{-1}AP$ is a direct sum of a zero matrix and matrices of
all 1's. Such a matrix is specified by partitioning $\{0, 1, \ldots, n\}$; the nonzero elements in the
block containing 0 are the indices of the rows and columns in the 0 matrix, and each other
block gives the rows and columns in a block of 1's. In this way we obtain $B(n + 1)$ such
matrices.

We claim that these are the only positive semidefinite 0,1-matrices. Given a positive
semidefinite 0,1-matrix $A$, let $J = \{i : a_{ii} = 1\}$. For $i \notin J$, the 2 by 2 submatrices involving
$a_{ii} = 0$ show that all entries in the $i$th row and $i$th column are 0. Define a relation $\sim$ on $J$
by setting $i \sim j$ if and only if $a_{ij} = 1$. Reflexivity and symmetry are immediate for $\sim$. If
$\sim$ is not transitive, then there exist $i, j, k \in J$ such that $a_{ij} = a_{jk} = 1$ but $a_{ik} = 0$. This
yields a 3 by 3 principal submatrix $B$ that is all 1's except for one off-diagonal pair of 0's.
Since the determinant of such a matrix is $-1$, this cannot happen. We conclude that $\sim$ is an
equivalence relation, which expresses $A$ in the form described above.
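
For small $n$ the counts can be reproduced by brute force, using the principal-minor criterion quoted in the solution and restricting to symmetric matrices as the solution does. The function names below are hypothetical, and the cofactor-expansion determinant is only suitable for tiny matrices.

```python
from itertools import combinations, product


def det(m):
    """Exact integer determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def principal_minors(m):
    """Yield the determinants of all principal submatrices of m."""
    n = len(m)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            yield det([[m[i][j] for j in idx] for i in idx])


def symmetric_01_matrices(n):
    """Yield every symmetric n-by-n matrix with entries 0 or 1."""
    positions = [(i, j) for i in range(n) for j in range(i, n)]
    for bits in product((0, 1), repeat=len(positions)):
        m = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(positions, bits):
            m[i][j] = m[j][i] = bit
        yield m


def count_definite(n):
    """Return (f(n), g(n)): counts of positive semidefinite and positive definite matrices."""
    semi = sum(all(d >= 0 for d in principal_minors(m)) for m in symmetric_01_matrices(n))
    pos = sum(all(d > 0 for d in principal_minors(m)) for m in symmetric_01_matrices(n))
    return semi, pos
```

For $n = 1, 2, 3, 4$ the first count should be $2, 5, 15, 52$, the Bell numbers $B(2), \ldots, B(5)$, and the second count should always be 1.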


Editorial comment. The characterization of positive semidefinite 0,1-matrices appears in 
R. A. Horn, The theory of infinitely divisible matrices and kernels, Trans. Amer. Math. Soc. 
136 (1969) 269-286 and also as exercises 4 and 5 on page 457 of R. A. Horn & C. Johnson, 
Topics in Matrix Analysis, Cambridge University Press, 1991. 


Solved also by S. Byrd & R. L. Smith, O. Krafft (Germany), S.-O. Troschke (Germany), and the proposer. 


The Exponent Isn’t Variable 


10508 [1996, 266]. Proposed by Jesús Ferrer, Universidad de Valencia, Burjasot, Spain.
Let $f, g : \mathbb{R} \to \mathbb{R}$ be two infinitely differentiable functions with $g$ analytic. Show that if,
for each point $x \in \mathbb{R}$, there is a positive integer $n(x)$ such that $f(x) = g(x)^{n(x)}$, then $f$ is
a constant power of $g$, i.e., there is a fixed positive integer $n$ such that $f(x) = g(x)^n$ for all
$x \in \mathbb{R}$.


Solution by Nicholas Passell and Alexander Smith, University of Wisconsin-Eau Claire,
Eau Claire, WI. If $g$ is constant the problem is trivial. Otherwise by analyticity of $g$ the
set $T = \{x : g(x) = 0, 1, \text{ or } -1\}$ consists of isolated points with no accumulation point.
Except at the points of $T$, the function $\mu(x) = (\ln|f(x)|)/(\ln|g(x)|)$ is continuous and takes only integer values.
Hence $\mu(x)$ is constant between any two of the isolated points of $T$. Suppose $t \in T$ and
that $f(x) = g(x)^{n_1}$ on $[a_1, t]$ and $f(x) = g(x)^{n_2}$ on $[t, a_2]$. Since $f(x)$ is infinitely
differentiable, its left and right derivatives of all orders at $t$ are equal. Thus $g(x)^{n_1}$ and
$g(x)^{n_2}$ are analytic and their derivatives of all orders at $t$ agree. Hence $g(x)^{n_1} = g(x)^{n_2}$
for all $x$. Looking at any point $x \notin T$ we see $n_1 = n_2$. Thus the exponent may be chosen
constant throughout $\mathbb{R}$.

Solved also by R. J. Chapman (U. K.), J. Cobb, J. Hejcman (Czech Republic), S. S. Kim (Korea), C. A. Kumar &
P. V. S. P. Saradhi (India), J. H. Lindsey II, L. E. Mattics, M. McKinzie, A. Pedersen (Denmark), R. Stong, T. V. Trif
(Romania), GCHQ Problems Group (U. K.), NSA Problems Group, and the proposer.


Random Distribution on the Sphere 


10516 [1996, 347]. Proposed by Donald A. Darling. Let $(X, Y, Z)$ be three random variables
such that $\alpha X + \beta Y + \gamma Z$ is uniformly distributed in the interval $[-1, 1]$ for every set of
three direction cosines, i.e., numbers with $\alpha^2 + \beta^2 + \gamma^2 = 1$. Show that $X^2 + Y^2 + Z^2 = 1$
with probability one.


Solution I by Richard A. Groeneveld and Stephen B. Vardeman, Iowa State University,
Ames, IA. Let $V = X^2 + Y^2 + Z^2$. We show that $E(V) = 1$ and $\operatorname{Var}(V) = 0$, so $V = 1$
with probability 1. Using $\alpha = 1$, $\beta = 1$, and $\gamma = 1$ successively, each of $X$, $Y$, and $Z$
has the distribution of $U$, where the latter is a uniform random variable on $[-1, 1]$. Thus
$E(X^2) = E(Y^2) = E(Z^2) = E(U^2) = 1/3$, and so $E(V) = E(X^2 + Y^2 + Z^2) = 1$.
Also $(X + Y)/\sqrt2$ and $(X - Y)/\sqrt2$ both have the distribution of $U$, so

$$\frac{E\bigl((X + Y)^4\bigr) - E\bigl((X - Y)^4\bigr)}{4} = 2E(X^3Y + XY^3) = 0.$$

One has $E(X^4) = E(Y^4) = 1/5$, and

$$\frac{E\bigl((X + Y)^4\bigr)}{4} = \frac{E(X^4 + 4X^3Y + 6X^2Y^2 + 4XY^3 + Y^4)}{4} = \frac{1}{5}.$$

Using $E(X^3Y + XY^3) = 0$ yields $E(X^2Y^2) = 1/15$, and similarly, $E(X^2Z^2) = E(Y^2Z^2)
= 1/15$. Hence,

$$\operatorname{Var}(V) = E\bigl((X^2 + Y^2 + Z^2)^2\bigr) - 1 = E(X^4 + Y^4 + Z^4 + 2X^2Y^2 + 2X^2Z^2 + 2Y^2Z^2) - 1 = 3\left(\frac{1}{5}\right) + 6\left(\frac{1}{15}\right) - 1 = 0,$$

which completes the proof.
Examination of the proof shows that in order that $X^2 + Y^2 + Z^2 = 1$ with probability
1, it is sufficient that $\alpha X + \beta Y + \gamma Z$ have the distribution of $U$ for the 9 sets of direction
cosines $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$, $(1/\sqrt2, \pm1/\sqrt2, 0)$, $(1/\sqrt2, 0, \pm1/\sqrt2)$, and
$(0, 1/\sqrt2, \pm1/\sqrt2)$.
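
The arithmetic in Solution I can be replayed in exact rational arithmetic. The Python sketch below (hypothetical function names) uses only the moments of the uniform variable $U$ on $[-1, 1]$ and the relation $E\bigl((X+Y)^4\bigr) = 4E(U^4)$ together with $E(X^3Y + XY^3) = 0$; it illustrates the bookkeeping and adds nothing to the argument.

```python
from fractions import Fraction


def uniform_moment(k):
    """E(U**k) for U uniform on [-1, 1]."""
    return Fraction(0) if k % 2 else Fraction(1, k + 1)


def solution_one_moments():
    """Return (E(X^2 Y^2), E(V), Var(V)) as derived in Solution I."""
    second = uniform_moment(2)  # E(X^2) = 1/3
    fourth = uniform_moment(4)  # E(X^4) = 1/5
    # E((X+Y)^4) = 4 E(U^4) and E(X^3 Y + X Y^3) = 0 give
    # 2 E(U^4) + 6 E(X^2 Y^2) = 4 E(U^4).
    mixed = (4 * fourth - 2 * fourth) / 6
    mean_v = 3 * second
    var_v = 3 * fourth + 6 * mixed - mean_v**2
    return mixed, mean_v, var_v
```

The function returns $E(X^2Y^2) = 1/15$, $E(V) = 1$, and $\operatorname{Var}(V) = 0$.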


Solution II by Robin J. Chapman, University of Exeter, Exeter, U. K. We use the fact that
the probability distribution of a triple of random variables $(X, Y, Z)$ is determined by its
characteristic function $\chi(u, v, w) = E(\exp(i(uX + vY + wZ)))$. But if $(u, v, w) =
r(\alpha, \beta, \gamma)$ with $\alpha^2 + \beta^2 + \gamma^2 = 1$, then

$$\chi(u, v, w) = E\bigl(\exp(ir(\alpha X + \beta Y + \gamma Z))\bigr) = \frac12 \int_{-1}^{1} e^{irt}\,dt = \frac{\sin r}{r} = \frac{\sin\sqrt{u^2 + v^2 + w^2}}{\sqrt{u^2 + v^2 + w^2}}.$$

Now let $(X_1, Y_1, Z_1)$ be a triple of random variables uniformly distributed according to area
on the unit sphere. If $\chi_1$ is its characteristic function, then $\chi_1$ is spherically symmetric.
Calculation using spherical coordinates gives

$$\chi_1(0, 0, w) = \frac{1}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} e^{iw\cos\theta} \sin\theta\,d\theta\,d\phi = \frac{\sin w}{w}.$$

By spherical symmetry, $\chi_1 = \chi$ and so $(X, Y, Z)$ and $(X_1, Y_1, Z_1)$ have the same distribution. Hence $X^2 + Y^2 + Z^2 = 1$ with probability one.

Similar arguments work for other numbers of variables. If $(X, Y)$ are random variables
with $\alpha X + \beta Y$ uniform on $[-1, 1]$ for all $\alpha$ and $\beta$ with $\alpha^2 + \beta^2 = 1$, then the distribution
of $(X, Y)$ is uniquely determined, and the joint probability density function of $(X, Y)$ is
$(1/2\pi)(1 - x^2 - y^2)^{-1/2}$. But if $n \ge 4$ and $(X_1, \ldots, X_n)$ are random variables with
$\sum_{j=1}^{n} \alpha_j X_j$ uniform on $[-1, 1]$ whenever $\sum_{j=1}^{n} \alpha_j^2 = 1$, then $X_1^2 + X_2^2 + X_4^2 = X_1^2 + X_3^2 + X_4^2 = 1$
with probability one. Hence $X_2^2 = X_3^2$ with probability one, but this cannot occur since
$(X_1, X_2, X_3)$ is uniformly distributed over the unit sphere. It follows that this result cannot
be extended to more than three variables.


Editorial comment. Using the methods of Solution I, Mark Pinsky showed that for the
analogous $n$-dimensional problem, $\operatorname{Var}(X_1^2 + \cdots + X_n^2) = (n - 3)(n + 5)/15$; so for $n > 3$
the norm cannot be almost surely constant. Solution II establishes the stronger conclusion
that $(X, Y, Z)$ itself is uniformly distributed over the surface of the sphere. (The truth of the
converse was noted by several solvers.) Can one find $(X, Y, Z)$ that satisfies the hypothesis
for the nine directions specified in Solution I but is not uniform on the sphere? The reference
A. Renyi, On projections of probability distributions, Acta Math. Sci. Hungar. 3 (1952) 131- 
142 was noted by Groeneveld and Vardeman; there is also the paper by our proposer: D. 
A. Darling, On a problem of Renyi, Period. Math. Hungar. 3 (1973) 5-7. Renyi proved a 
more general result: A distribution in R? is determined by its projections on every straight 
line through the origin. 


Solved also by P. J. Fitzsimmons, E. Hertz, J. H. Lindsey II, L. E. Mattics, K. G. Merryfield, W. A. Newcomb, 
S. Northshield, N. Passell, M. A. Pinsky, K. Schilling, R. Stong, GCHQ Problems Group (U. K.), and the proposer. 


An Excluded Circle 


10517 [1996, 347]. Proposed by Jean Anglesio, Garches, France. Let $\triangle ABC$ be a triangle
and let $H$ be its orthocenter and $I$ its incenter. If $W$ is the point such that $\overrightarrow{HW} = 4\,\overrightarrow{HI}$ and
$R = 2\sqrt2\,|\overrightarrow{HI}|$, show that none of the vertices $A$, $B$, or $C$ is in the interior of the circle
with center $W$ and radius $R$.


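Solution by Joel Rosenberg, University of Michigan, Ann Arbor, MI. As usual, let us write
$a$, $b$, and $c$ for the lengths of sides $BC$, $CA$, and $AB$, respectively. Let $O$ be the circumcenter
of $\triangle ABC$, and let $\rho$ be its circumradius. We analyze the problem by using vectors centered
at $O$ and writing $\mathbf{P}$ for the vector $\overrightarrow{OP}$. We also write $(\alpha, \beta, \gamma)$ for $\alpha\mathbf{A} + \beta\mathbf{B} + \gamma\mathbf{C}$.
Observe that $(\mathbf{B} - \mathbf{C}) \cdot (\mathbf{B} + \mathbf{C}) = \rho^2 - \rho^2 = 0$, and cyclically, so we conclude that
$\mathbf{H} = \mathbf{A} + \mathbf{B} + \mathbf{C} = (1, 1, 1)$. Also, for any $P$,

$$\mathbf{P} = \left(\frac{\operatorname{Area}(\triangle PBC)}{\operatorname{Area}(\triangle ABC)}, \frac{\operatorname{Area}(\triangle PCA)}{\operatorname{Area}(\triangle ABC)}, \frac{\operatorname{Area}(\triangle PAB)}{\operatorname{Area}(\triangle ABC)}\right),$$

so in particular $\mathbf{I} = \bigl(a/(a+b+c),\ b/(a+b+c),\ c/(a+b+c)\bigr)$ and

$$\mathbf{W} = 4\mathbf{I} - 3\mathbf{H} = \left(\frac{a - 3b - 3c}{a+b+c}, \frac{-3a + b - 3c}{a+b+c}, \frac{-3a - 3b + c}{a+b+c}\right).$$

The definition of circumcenter gives $\mathbf{A} \cdot \mathbf{A} = \mathbf{B} \cdot \mathbf{B} = \mathbf{C} \cdot \mathbf{C} = \rho^2$, and the law of cosines
gives $\mathbf{A} \cdot \mathbf{B} = \rho^2 - c^2/2$, $\mathbf{B} \cdot \mathbf{C} = \rho^2 - a^2/2$, and $\mathbf{C} \cdot \mathbf{A} = \rho^2 - b^2/2$. Thus if $\mathbf{P} = (u, v, w)$
and $\mathbf{Q} = (x, y, z)$, then

$$\mathbf{P} \cdot \mathbf{Q} = \begin{pmatrix} u & v & w \end{pmatrix} \begin{pmatrix} \rho^2 & \rho^2 - c^2/2 & \rho^2 - b^2/2 \\ \rho^2 - c^2/2 & \rho^2 & \rho^2 - a^2/2 \\ \rho^2 - b^2/2 & \rho^2 - a^2/2 & \rho^2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}.$$

We conclude that

$$|\overrightarrow{HI}|^2 = (\mathbf{I} - \mathbf{H}) \cdot (\mathbf{I} - \mathbf{H}) = \frac{4\rho^2(a+b+c) - a^3 - b^3 - c^3 - abc}{a+b+c},$$

$$|\overrightarrow{WA}|^2 = (\mathbf{A} - \mathbf{W}) \cdot (\mathbf{A} - \mathbf{W}) = \frac{36\rho^2(a+b+c) - 9a^3 - 12b^3 - 12c^3 + 3a^2b + 3a^2c + 4b^2c + 4bc^2 - 16abc}{a+b+c}.$$

Now we write $K$ for $\operatorname{Area}(\triangle ABC)$, and recall the law of sines $K = abc/4\rho$ and Heron's
formula $16K^2 = (a+b+c)(a+b-c)(c+a-b)(b+c-a)$. Thus we can write

$$|\overrightarrow{WA}|^2 = \frac{f(a, b, c)}{16K^2} \quad\text{and}\quad |\overrightarrow{HI}|^2 = \frac{g(a, b, c)}{16K^2},$$

where $f$ and $g$ are sixth degree homogeneous polynomials in $a$, $b$, and $c$. Using Mathematica, we find that

$$|\overrightarrow{WA}|^2 - 8|\overrightarrow{HI}|^2 = \frac{f(a, b, c) - 8g(a, b, c)}{16K^2} = \frac{(a^3 - 2a^2b - 2a^2c - ab^2 + 4abc - ac^2 + 2b^3 - 2b^2c - 2bc^2 + 2c^3)^2}{16K^2}.$$

Thus $|\overrightarrow{WA}| \ge 2\sqrt2\,|\overrightarrow{HI}|$, and similarly $|\overrightarrow{WB}| \ge 2\sqrt2\,|\overrightarrow{HI}|$, $|\overrightarrow{WC}| \ge 2\sqrt2\,|\overrightarrow{HI}|$, so none of
the points $A$, $B$, $C$ lies within the circle centered at $W$ with radius $2\sqrt2\,|\overrightarrow{HI}|$.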
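
The inequality can be tested numerically on particular triangles. The Python sketch below (hypothetical function names) computes the circumcenter from the standard coordinate formula, obtains the orthocenter from $\mathbf{H} = \mathbf{A} + \mathbf{B} + \mathbf{C}$ relative to the circumcenter, and returns the three differences $|WP| - R$; it is a floating-point illustration only.

```python
from math import dist, sqrt


def circumcenter(A, B, C):
    """Circumcenter of a non-degenerate triangle with vertices given as (x, y) pairs."""
    ax, ay = A
    bx, by = B
    cx, cy = C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax * ax + ay * ay) * (by - cy) + (bx * bx + by * by) * (cy - ay)
          + (cx * cx + cy * cy) * (ay - by)) / d
    uy = ((ax * ax + ay * ay) * (cx - bx) + (bx * bx + by * by) * (ax - cx)
          + (cx * cx + cy * cy) * (bx - ax)) / d
    return (ux, uy)


def excluded_circle_margins(A, B, C):
    """Return [|WA| - R, |WB| - R, |WC| - R] with W = 4I - 3H and R = 2*sqrt(2)*|HI|."""
    O = circumcenter(A, B, C)
    # With the circumcenter as origin, H = A + B + C; translate back to coordinates.
    H = (A[0] + B[0] + C[0] - 2 * O[0], A[1] + B[1] + C[1] - 2 * O[1])
    a, b, c = dist(B, C), dist(C, A), dist(A, B)
    s = a + b + c
    I = ((a * A[0] + b * B[0] + c * C[0]) / s, (a * A[1] + b * B[1] + c * C[1]) / s)
    W = (4 * I[0] - 3 * H[0], 4 * I[1] - 3 * H[1])
    R = 2 * sqrt(2) * dist(H, I)
    return [dist(W, P) - R for P in (A, B, C)]
```

All three margins should be nonnegative up to rounding; equality can occur, for instance at one vertex of a 3-4-5 right triangle, where the squared polynomial in the solution vanishes.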

Editorial comment. All four solutions were quite computational; two used Mathematica. 
Solved also by M. Benedicty, V. Schindler (Germany), and the proposer. 


An Integral Giving Euler’s Constant 


10524 [1996, 427]. Proposed by Jean Anglesio, Garches, France. Show that

$$\lim_{u\to\infty} \int_{1/u}^{u} \left(\frac{\sin x}{x} - \cos x\right) \frac{\ln x}{x}\,dx = 1 - \gamma,$$

where $\gamma$ is Euler's constant.


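Solution by Donald A. Darling, Newport Beach, CA. Transform

$$I(u) = \int_{1/u}^{u} \left(\frac{\sin x}{x} - \cos x\right) \frac{\ln x}{x}\,dx$$

by integrating by parts:

$$I(u) = \left[\left(1 - \frac{\sin x}{x}\right) \ln x\right]_{1/u}^{u} - \int_{1/u}^{u} \left(1 - \frac{\sin x}{x}\right) \frac{dx}{x}
= \left(1 - \frac{\sin u}{u}\right) \ln u + \left(1 - u \sin\frac{1}{u}\right) \ln u - \int_{1/u}^{u} \left(1 - \frac{\sin x}{x}\right) \frac{dx}{x}.$$

Use $\int_{1/u}^{u} (1 + x)^{-1}\,dx = \ln u$ to write the previous equation as

$$I(u) = -\sin u\,\frac{\ln u}{u} + \left(1 - u \sin\frac{1}{u}\right) \ln u + \int_{1/u}^{u} \left(\frac{\sin x}{x} - 1 + \frac{x}{1 + x}\right) \frac{dx}{x}.$$

Since $1 - u\sin(1/u) \sim 1/(6u^2)$ as $u \to \infty$, the first two terms on the right go to zero, and
the integral over $[0, \infty)$ is absolutely convergent, so that

$$\lim_{u\to\infty} I(u) = \int_0^{\infty} \left(\frac{\sin x}{x} - \frac{1}{1 + x}\right) \frac{dx}{x} = 1 - \gamma$$

by a standard representation of Euler's constant; see, for example, I. S. Gradshteyn and
I. M. Ryzhik, Table of Integrals, Series and Products, formula 8.367(8) or 3.781(1).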


Editorial comment. Gaston Gonnet noted that the Maple computer algebra system immedi- 
ately returns the asserted value for the limit. 


Solved also by Z. Ahmed (India), D. Borwein (Canada), D. Bradley (Canada), R. J. Chapman (U. K.), E. Deutsch, 
M. Golomb, G. H. Gonnet (Switzerland), R. A. Groeneveld, K.-W. Lau (Hong Kong), M. Omarjee (France), R. Richberg 
(Germany), P. G. Rooney, N. S. Thornber, T. V. Trif (Romania), M. Vowe (Switzerland), C. Y. Yildirim (Turkey), GCHQ 
Problems Group (U. K.), NSA Problems Group, and the proposer. 
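
The representation of \(1 - \gamma\) used in the solution can also be checked numerically. The following is a minimal sketch, added as an illustration and not taken from any of the submitted solutions. It applies Simpson's rule on [0, T] and adds the exact tail of the \(-1/(x(1+x))\) part. The cutoff T, the step size, the function names, and the hard-coded value of \(\gamma\) are all assumptions of the sketch.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for comparison


def integrand(x):
    """(sin x / x - 1/(1+x)) / x, with its limiting value 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return (math.sin(x) / x - 1.0 / (1.0 + x)) / x


def simpson(f, a, b, n):
    """Composite Simpson rule with n subintervals (n is rounded up to even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def one_minus_gamma_estimate(half_periods=400, steps_per_unit=100):
    """Estimate the integral over [0, infinity) of the integrand above."""
    # Choosing T with cos(T) = 0 makes the neglected tail of sin(x)/x^2
    # of order 1/T^3 rather than 1/T^2.
    upper = (half_periods + 0.5) * math.pi
    head = simpson(integrand, 0.0, upper, int(upper * steps_per_unit))
    # Exact tail of -1/(x(1+x)) over [T, infinity) is -ln(1 + 1/T).
    tail = -math.log1p(1.0 / upper)
    return head + tail
```

With the default parameters the estimate should agree with `1 - EULER_GAMMA` to well within the tolerance used in a quick check such as `abs(one_minus_gamma_estimate() - (1 - EULER_GAMMA)) < 1e-6`.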
REVIEWS


Edited by Underwood Dudley 
Mathematics Department, De Pauw University, Greencastle, IN 46135 


Knots and Surfaces: A Guide to Discovering Mathematics. By David W. Farmer and
Theodore B. Stanford. American Mathematical Society, 1996, vii + 101, $19.00.

Knots and Surfaces. By N. D. Gilbert and T. Porter. Oxford University Press, 1994,
xi + 268, $75.00 hardbound, $34.95 softcover.


Reviewed by William D. Dunbar 


The two books under review, which I’ll refer to as [FS] and [GP], have the same 
title, but are very different in content and style. [FS] is, in its own words, “a guide
to discovering mathematics... suitable for a one semester course at the beginning 
undergraduate level [with] no prerequisites.” The discussion is driven by a succes- 
sion of “Tasks” (over 150 in the three main chapters on networks, surfaces and 
knots), which include both calculations and short proofs of the “explain why” sort. 
The six-color theorem for maps on a sphere is proven in the first chapter, 
orientable surfaces (with or without boundary) are classified in the second chapter, 
and the invariance of linking number and 3-colorability under Reidemeister moves 
is proven in the third chapter (asserting but not proving that this implies invariance 
under ambient isotopy). The final chapter consists of a list of ideas for projects or 
term papers. Some are generalizations of material presented earlier, some address 
the use of knots and surfaces in art, and some are analyses of games played on 
surfaces (e.g., tic-tac-toe on a torus). I feel that [FS] would serve well as the basis 
of an independent study course, in which the student would work through the tasks 
in a journal, subject to periodic review by the instructor (to monitor the proof-writ- 
ing). The writing is clear and engaging, and the tasks should be effective at setting 
a reasonable pace. One of the most memorable tasks for me was to label a map of 
the continental United States, which has been rotated and distorted homeomor- 
phically so that the states are polygons that fit together into a rectangle (territorial 
waters are included in the state, so Michigan has one component). Being from New 
England, I knew to start with Maine. 

If [FS] might better be titled “Graphs and Surfaces and Knots,” then [GP] is 
really “Groups and Knots and Surfaces and Graphs.” In its own words, it is “based 
on a third-year undergraduate course” at a British university. The first two 
chapters discuss knot diagrams and several of the knot polynomials defined for 
knot diagrams (Jones, Kauffman, HOMFLY, Alexander). The next two chapters 
introduce the basic notions of topology and apply them to the classification of 
surfaces (orientable or not, with or without boundary). Chapter 5 proves that the 
genus of a connected sum of two knots is the sum of the genera of the summands, 
and outlines a proof of the prime factorization of knots. Also the HOMFLY 
polynomial of a connected sum is proven to be the product of the polynomials of 
the summands. 
Group theory begins in Chapter 6, with Cayley graphs, construction of free
groups, group presentations, Tietze transformations, cyclic decomposition of finitely 
generated abelian groups, pushouts of groups, and the Wirtinger & Dehn presen- 
tations associated to a knot diagram. Chapter 7 switches to graphs, especially as 
embedded in surfaces, and closes by giving without proof the minimum genus 
surface in which the complete graph \(K_n\) on n vertices can be embedded. Fox
derivatives are used in Chapter 8 to give another facet of the Alexander polyno- 
mial. The following three chapters introduce the fundamental group of a topologi- 
cal space, prove the Seifert-Van Kampen theorem and apply it to surfaces and to 
knots. The final chapter is a general discussion of covering spaces, which are then 
applied to graphs, ending with the Schreier formula for the rank of a subgroup of 
given index in a free group of given rank. 

I am not enthusiastic about [GP] as a textbook, for two basic reasons: its 
narrative line is weak, and it is not well edited. By the former, I mean that the 
enormous investment in combinatorial group theory in the middle of the book does 
not seem to pay off adequately in terms of a better understanding of knots or 
surfaces. In particular, I found the ending of the text quite abrupt. As for the 
latter, I'll give a few examples of things that either mystified or annoyed me: 

(1) On page 52, a knot is defined as a map \(S^1 \to \mathbb{R}^3\), but on page 53, a theorem
asserts that two knots are ambient isotopic if and only if they have isotopic
(unoriented) diagrams, which appears to be false for non-invertible knots such as
\(8_{17}\). Since this definition gives an implicit orientation to all knots, the later use of
the term “oriented knots” on page 96 is confusing.

(2) On page 94, the fact that the degree of the Alexander polynomial of a knot 
is no more than twice the genus of the knot is called “strange,” but it follows 
immediately from \(\Delta(x) = \det(V - xV^{T})\), where V is a Seifert matrix for the knot
(with 2g rows and columns, where g is the genus of the Seifert surface). 

(3) On page 112 and throughout, group homomorphisms are misspelled as
“homeomorphisms.” 

(4) On page 180, the reader is advised that “now would be a good time to look 
up the definitions” of an integral domain and a unique factorization domain. This 
statement is immediately followed by a string of definitions, including those two. 

(5) Repeated explanations of Lebesgue’s lemma and changes in the notation for 
paths make it appear that Chapters 9 and 10 were written by different people 
without consideration of the transition. \(\bar{w}\) is used both for the reverse of the path
w (on page 231) and for something quite different (on page 228). 

By contrast, I’ve taught a topology course from C. Livingston’s book [L], and feel 
that it gives a clearer picture of the wide variety of knot invariants, how two 
invariants can be related by an inequality or independent of each other, and how 
having several methods to calculate the same thing (such as the Alexander 
polynomial) can be intriguing as well as useful. The algebra is quickly applied; the 
discussion of permutations supports both the development of the concept of the 
knot group (via labelling edges in a knot diagram “consistently” with permutations) 
and also the later results on linking numbers of periodic knots with their axes of 
rotational symmetry. [L] ends with an open question, that of finding a knot that 
cannot be distinguished from the unknot by, say, the HOMFLY polynomial. 
Indeed, there are many references to open problems throughout the book, which 
reinforce the point that mathematics is an evolving subject, a point that undergrad- 
uates need not only to hear, but to see. 
The Book of Numbers. By John Horton Conway and Richard K. Guy. Springer-Verlag,
1996, 320, $29.95. 


Reviewed by Andrew Bremner 


Am I the only person ever to have claimed a pineapple as deduction against 
income tax? The arrival of Conway and Guy’s Book of Numbers may well mean 
that others follow suit when classroom teachers discover the pedagogical virtues of 
using the fruit to demonstrate the occurrence in nature of the Fibonacci numbers. 
What a delight it is to turn the pages of this book, being simultaneously enter- 
tained and enlightened by these masters of arithmetic mathesis. The planning of 
such a book displays enormous conceit, and bringing it to realisation is a remark- 
able achievement. Readers must enjoy discovering for themselves quite how 
successful the authors have been. With “numbers” as the theme of the book, the 
volume could well have been arbitrarily large; yet in practice it is quite a slim tome 
of just ten chapters (short of fourteen, one notes, the infinity of Borges). 

It is in fact admirably restrained in size; it is also beautifully produced by 
Springer, and comfortably priced to boot. What more could one ask? There is a 
clear readership amongst the devoted followers of Martin Gardner, and Scientific 
American’s Mathematical Games columns. Readers familiar with the two volume 
Winning Ways by Berlekamp, Conway, and Guy [1], will undoubtedly recognize 
inimitable matters of style common to both works, including many whimsically 
delightful illustrations. But the material of the Book of Numbers is intentionally far 
more accessible to the lay mathematician than that of Winning Ways, and can be 
perused with pleasure and profit by pappous schoolboy and pompous professor. 
There is truly something here for everyone. 

The first chapter starts with the language of number, from the hypothesis that 
“Hickory, Dickory, Dock” is a corruption of some rustic counting sequence for 
“Eight, Nine, Ten”, to the syntax of number names in different languages. As
illustration, a table of cardinals in Welsh is given, where we learn for instance that 
“eighteen” occurs as “two times nine” and “fifty” as “half a hundred”. (Those of us 
who have slogged through classical Hawaiian will recall the intrigue at “seventy” 
translating as “forty with thirty remainder” despite the regular formations of the 
other multiples of ten.) There follow individual ruminations based on the numbers 
between one and a hundred. So we find the familiar, the French “quinzaine” 
denoting the same period of time as the English “fortnight”; and the less familiar,
“punch”, a drink with five ingredients, from the Hindi word for five.
This whole section begs a parlour game with the aim of finding some literary
reference to the integers between one and a hundred (maybe the Bible should be 
disbarred for this purpose. John Buchan provides an easy start; I suspect a search 
of Tristram Shandy would also prove fruitful.) The visual equivalent exists on 
celluloid in Greenaway’s “Drowning by Numbers”: the viewer becomes aware that
the numbers from one to one hundred are occurring in sequence in the film, with 
an abrupt ending at the appearance of 100. 

We progress to historical number systems (and discover how Caesar dealt with 
fractions: for instance, the five spots on a die is the Roman quincunx, the symbol 
for five uncia, or five twelfths. Of course, there are still today twelve ounces in 
both the troy and the apothecary’s pound). Later in the book is a consequent 
discussion of tablet number 7289 from the Yale Babylonian Collection, an aston- 
ishing base 60 computation of the square root of 2. Its accuracy is such that the 
value is correct to six decimal places. Also shown is a photograph of the tablet 
Plimpton 322 which in Babylonian cuneiform appears to be a table of Pythagorean 
triangles (here duly completed by the authors, in Babylonian of course). 

The second chapter includes Patterns Providing Pretty Proofs, with some 
geometrically inspired formulations of series summations. The Ackermann num- 
bers are described, being the sequence 1, 4, 7625597484987, ... whose fourth term 
is so staggeringly immense that the cerebellum quivers merely at the thought of 
trying to comprehend it. Yet this is only a springboard for the authors’ “chained- 
arrow” numbers; these in turn are of such magnitude as guaranteed to leave your 
brainbox smoking. 

The following chapter concentrates on sequences, and in particular methods for 
determining the rule of formation of a sequence of integers. Little is assumed, so 
that binomial coefficients and Pascal’s triangle are introduced from first principles. 
But there is an interesting discussion of Difference Fans and of Number Walls, of 
which I choose the latter to illustrate here with an example. If standard differenc- 
ing techniques do not reveal the rule behind a sequence, then the sequence cannot 
be of polynomial type. To detect sequences of exponential type (linear recurrence 
relations) one forms a Number Wall: below a row of ones place the terms of the 
sequence, and then compute further rows in the wall using the rule that for each 
cross of five bricks, 


\[ (\text{centre-brick})^2 = (\text{north-brick})(\text{south-brick}) + (\text{east-brick})(\text{west-brick}). \]

That is, for the cross

        N
    W   C   E
        S

\(C^2 = NS + EW\). So for the Fibonacci sequence, for example, we develop the wall


    1   1   1   1   1   1   1   1   1   1
    0   1   1   2   3   5   8  13  21  34
        1  -1   1  -1   1  -1   1  -1   1
            0   0   0   0   0   0   0   0

If the sequence is genuinely exponential, then ultimately a row of zeroes will 
appear, and the number of rows between the ones and zeroes gives the length of 
the recurrence relation, which is then easy to compute. What is so intriguing about 
this construction is the surely non-obvious fact that every brick will be computed as 
an integer. To prove all the properties of the Number Wall will clearly afford
excellent exercise. In more substantial vein, I apply the Number Wall technique to
the Shallit sequence (see [2]): \(a_{n+1}\) is the least integer such that

\[ \frac{a_{n+1}}{a_n} > \frac{a_n}{a_{n-1}}, \]

with \(a_0 = 8\), \(a_1 = 55\). The sequence is thus 8, 55, 379, 2612, 18002, .... We construct
the wall


    1    1     1      1       1        1        1         1          1
    8   55   379   2612   18002   124071   855106   5893451   40618081
        -7   -19   -214   -1448    -5171   -87785    -82185   -4520276
              -3      7      55     -809     8515    -66185     501931
                     -1      -6      -36     -216     -1296      -7776
                              0       0        0         0          0


which indicates a four term recurrence relation. Indeed, we appear to find

\[ a_n = 6a_{n-1} + 7a_{n-2} - 5a_{n-3} - 6a_{n-4}. \]

But beware, O constructor murorum! It is anti-intuitive that the sequence \(\{a_n\}\) be
exponential, and in fact with sufficient computer-aided effort it has been verified
that the linear recurrence is valid only in the range \(4 \le n \le 11055\). The term \(a_{11056}\)
differs by 1 from that computed from the recurrence! It is instructive to estimate
the size of \(a_{11056}\), and contemplate the fact that the bottom row of the above wall
starts out with 11051 zeroes before becoming non-zero.
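
The cross rule is mechanical enough to sketch in a few lines. The code below is an added illustration with hypothetical function names, not the authors' or the reviewer's program. It builds each new row from the rule \(C^2 = NS + EW\) using only the terms supplied, so every row is two bricks shorter than the one above it (the printed walls use further terms of the sequence to fill out the right-hand side). It assumes no zero appears as a north brick, since the simple rule cannot divide by zero.

```python
def shallit(n_terms, a0=8, a1=55):
    """First n_terms of the sequence in which a_{n+1} is the least integer
    with a_{n+1}/a_n > a_n/a_{n-1}."""
    seq = [a0, a1]
    while len(seq) < n_terms:
        prev, cur = seq[-2], seq[-1]
        seq.append(cur * cur // prev + 1)  # least integer exceeding cur^2/prev
    return seq[:n_terms]


def number_wall(seq):
    """Rows of the wall below the row of ones, stopping at an all-zero row."""
    north = [1] * len(seq)  # the row of ones
    centre = list(seq)      # the sequence itself
    rows = [centre]
    while len(centre) >= 3:
        south = []
        for j in range(1, len(centre) - 1):
            if north[j] == 0:
                raise ZeroDivisionError("zero north brick: simple cross rule fails")
            # S = (C^2 - E*W) / N, which should always divide exactly.
            value, remainder = divmod(
                centre[j] ** 2 - centre[j - 1] * centre[j + 1], north[j]
            )
            if remainder:
                raise ArithmeticError("non-integer brick")
            south.append(value)
        rows.append(south)
        if not any(south):
            break
        north, centre = centre[1:-1], south
    return rows
```

When the last row returned is all zeroes, `len(number_wall(seq)) - 1` is the number of rows between the ones and the zeroes: 2 for the Fibonacci numbers and 4 for the first nine terms of the Shallit sequence.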

This is heady stuff, but we are propelled forward into Famous Families of 
Numbers, where the Great Arithmetician of Ulm, Johann Faulhaber, is finally 
given full credit for his discovery and use in the Academiae Algebrae (1631) of what 
later became known as the Bernoulli numbers. It was almost eighty years later that 
Bernoulli himself embarked on his extensive study of these numbers (with, it 
should be remarked, full due to Faulhaber). So we find Bell and Catalan, 
Ramanujan and Stirling, and of course Fibonacci. There is an extended discussion 
of the occurrence in nature of the Fibonacci numbers, from the edible sorosis to 
leaf phyllotaxis. A mature pineapple will display eight spirals of bracts in one 
direction, and thirteen in the other, clockwise and anticlockwise (a supermarket 
sortie found no counterexample to this rule; the lady who observed my protracted 
handling of every single pineapple advanced with the advice: “You should smell 
them, you know”). Similarly a sunflower head will usually display 34 and 55 spirals 
of seeds. The authors provide what seems a very plausible explanation for this 
phenomenon (“Say, bud, where do you think you are going?”). D’Arcy Thompson’s 
admirable On Growth and Form was a favourite tome of the childhood bookshelf, 
but is sadly remiss in failing even to mention the Fibonacci sequence. 

In the Primacy of Primes, we meet inter alia Conway’s famous Prime Producing 
Machine, and we are given the current status on the factorization of the Mersenne 
and Fermat primes. The latter, \(F_n = 2^{2^n} + 1\), is now known to be composite for
\(5 \le n \le 23\); factorizations are given for \(5 \le n \le 11\), and partial factorizations for
n = 12,13. This is certainly very much up-to-date, with F,, and F,, having only 
just fallen (however the authors do miss two very recently announced Mersenne 
primes). 

Fruitfulness of Fractions is a rag-bag collection of Farey fractions, decimal 
expansions of prime reciprocals, shuffles, and Pythagorean triangles. Continued 
fractions, which themselves might merit an entire volume, are restricted to a
mention in terms of the astronomical Metonic cycle, used to determine the Jewish
calendar and the date of Easter. (What a shame Conway was not prevailed upon to 
add a page or two at this stage, even if not strictly relevant. Master of much, he is 
supreme in the quirks of the calendar, and I have savoured for many years his 
perspicacious observation that any Swede born on February 29 in the year 1696 
would not have celebrated a first birthday for another 48 years!) 

“Algebraic Numbers” leads to the ruler-and-compass construction of the regular 
polygons, and beautifully elegant constructions with appropriate diagrams are 
given for the 3-, 5-, 7-, 9-, 13-, and 17-gons (“But...” I hear you murmur; yes, 
indeed, the luxury of an angle-trisector is used for the 7-, 9-, and 13-gons.) Some 
problems are given whose solution will involve a specific algebraic number, for 
instance, that of finding a hexagon of largest area given that no two vertices are 
more than one unit of distance apart. Ron Graham solves the problem with a 
hexagon of area A, with A satisfying an irreducible polynomial of degree 10
(A = 0.674981...; the area of a regular hexagon of unit diameter is \(3\sqrt{3}/8\) =
0.649519...). There is even a somewhat contrived problem whose solution involves
the root of a polynomial of degree 71. It should perhaps be stressed that 
throughout the book, all these deeper results are simply quoted, and each chapter 
has a sizeable bibliography referring the reader to research papers where neces- 
sary, and to appropriate literature in general. 

The remaining chapters include sections on imaginary and transcendental 
numbers, with mention of the connection between Euler’s prime-producing polynomial
\(n^2 - n + 41\) and the fact that \(e^{\pi\sqrt{163}}\) is an integer. (“But...” I hear you
mutter again: you must simply go away and compute, but make sure to give
yourself at least 31 significant figures.) There is also a nice discussion of Gregory
numbers, where the name of Lewis Carroll arises. 
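
For readers who do want to go away and compute, here is a minimal sketch, added as an illustration rather than taken from the book. It checks that \(n^2 - n + 41\) is prime for n = 1, ..., 40 and evaluates \(e^{\pi\sqrt{163}}\) to 50 significant figures with Python's `decimal` module. The 50-digit value of \(\pi\) is hard-coded, and the function names are hypothetical.

```python
from decimal import Decimal, localcontext

# pi to 50 decimal places, hard-coded because decimal has no built-in pi.
PI_50 = Decimal("3.14159265358979323846264338327950288419716939937510")


def is_prime(m):
    """Trial division; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def exp_pi_sqrt163(digits=50):
    """e^(pi * sqrt(163)) to the requested number of significant figures."""
    with localcontext() as ctx:
        ctx.prec = digits
        return (PI_50 * Decimal(163).sqrt()).exp()


def gap_from_nearest_integer(digits=50):
    """Distance from e^(pi * sqrt(163)) to the nearest integer."""
    with localcontext() as ctx:
        ctx.prec = digits
        x = exp_pi_sqrt163(digits)
        return abs(x - x.to_integral_value())
```

The value has 18 digits before the decimal point and its gap from the nearest integer is below \(10^{-12}\), so about 30 significant figures are not enough to tell it from an integer, which is the point of the reviewer's warning.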

Finally, in the unique style of the authors, there is a chapter on the infinite and 
the infinitesimal. We are shown how to add and multiply, just as if infinities 
occurred every day in our cheque-books. Surrealism comes into play, as it were, 
but it’s only a game. Sorry. You just have to get hold of a copy of this book; trying 
to summarize adequately the 320 pages of Conway & Guy is as demanding an 
exercise as extracting the plums from a particularly rich Christmas pudding. 

So the book has multifarious virtues: what are its faults? Being ever greedy, one 
can lament what the authors have chosen to omit, as much as rejoice in what they 
include. Here again is another parlour game, to alphabetize missing topics that 
perhaps merit mention: amicable numbers and Alcuin’s sequence, Beatty se- 
quences, congruent numbers... There is a slightly irritating and curious inconsis- 
tency with the chosen type-faces. Springer have chosen a rather spidery font for 
some of the tables, which can render the content impactless. For instance, Table 
6.1 lists those repeating decimals that occur in fractions with denominator a given 
prime. One of the features that should leap to the eye is the equal length of the 
entries for a given prime: 027, 054, 081,... for the prime 37. As it stands, however, 
entries such as that at 53 look anything but equal in length. Yet other tables, such 
as Table 1.4, decimal expansions of “some of our favourite numbers”, are set 
perfectly. Which is not in the index; at least, “perfect” is not in the index. A little 
unfortunate, since this was the very first item that I looked up on receiving the 
book (perfect numbers are indeed mentioned on pages 136-137). The index also 
mispaginates at least one item. Tut. However, if such be the sum total of faults, 
then there is not much cause for curmudgeonly grumble. 
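
The equal-length property that Table 6.1 is meant to display is easy to reproduce by long division. The sketch below is an added illustration with a hypothetical function name, not the book's table. It returns the repeating block of k/p for a prime p other than 2 and 5.

```python
def repetend(numerator, prime):
    """Repeating block of digits of numerator/prime, for a prime other than
    2 and 5 and 0 < numerator < prime, found by ordinary long division."""
    digits = []
    start = numerator % prime
    remainder = start
    while True:
        remainder *= 10
        digits.append(str(remainder // prime))
        remainder %= prime
        if remainder == start:  # the remainders have come full circle
            break
    return "".join(digits)
```

For the prime 37 this gives the blocks 027, 054, 081 quoted above, and for 53 every block has the same length, 13.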

It is clear that this eclectic review can only begin to convey the pleasure that this 
book has provided. Throughout, the authors communicate their enthusiasm and 
exuberance with great éclat. There is joy at being in the care of mathematicians
who delight in the sheer friendliness of numbers. The lucid explanations and
insights can be startling and impressive. Gounod set to music in his curiously 
admonitory duet “L’Arithmétique”: 


Cultiver cet art salutaire
C’est apprendre à garder son bien,
Car, mes amis, sur cette terre,
Sachez, qu’on a souvent affaire
À des gens qui comptent trop bien.


How fortunate that these authors who count so much better than most of us have 
imparted their wisdom to the printed medium. In Japan at New Year, the 
takarabune is the treasure-laden ship of the Gods of Good Fortune. Here is our 
ship, with Conway and Guy the Bringers of Happiness. 
Mathematics Appreciation, T(13: 1). Mathematics:
Its Power and Utility, Fifth Edition.
Karl J. Smith. Brooks/Cole, 1997, xv + 
588 pp, $61.95. [ISBN 0-534-34462-3] New 
for this edition: more historical information; 
more problems asking students to explain con- 
cepts in their own words; glossary of important 
terms; appendices on Hindu—Arabic numbers 
and enumeration systems. (Fourth Edition, TR, 
January 1995.) LB 


Mathematics Appreciation, S, L. Power Play. 
Edward J. Barbeau. Spectrum Ser. MAA, 1997, 
xi + 185 pp, $29 (P). [ISBN 0-88385-523- 
2] Lots of neat stuff about powers of integers, 
Pythagorean triples, Pell’s Equation, Catalan 
Conjecture, and more. Includes a collection of 
“interesting sets” (e.g., {7, 18} is “interesting” 
because \(7^3 = 1 + 18 + 18^2\)). Good exercises. JO


Finite Mathematics, T(13-14: 1). Ap- 
plied Finite Mathematics, Fifth Edition. S.T. 
Tan. Brooks/Cole, 1997, xvii + 674 pp, 
$73.95. [ISBN 0-534-95562-2] Business ori- 
ented; emphasis on linear models (systems, 
linear programming), finance, sets and count- 
ing, elementary probability, Markov chains and 
games. Many applied examples. (Third Edi- 
tion, TR, August-September 1990.) RM 


Education, P, L. Many Visions, Many Aims, 
Volume I. William H. Schmidt, et al. Kluwer 
Academic, 1997, ix + 276 pp, $120. [ISBN 0-7923-4436-7]
A cross-national investigation
of curricular intentions in school mathematics 
based on an analysis of textbooks and curricu- 
lum guides in almost fifty countries participat- 
ing in the Third International Mathematics and 
Science Study (TIMSS). Key finding: variation
(“noise”) was pervasive “in myriad ways
both small and large,” dominating commonalities
(“signal”) across both countries and topics.
LAS


Education, P, L*. A Splintered Vision: An 
Investigation of U.S. Science and Mathemat- 
ics Education. William H. Schmidt, Curtis 
C. McKnight, Senta A. Raizen. Kluwer Aca- 
demic, 1997, 163 pp, $87. [ISBN 0-7923-4440- 
5] Summary report of U.S. curricula, text- 
books, and instructional practice drawn from 
the Third International Science and Mathemat- 
ics Study (TIMSS). Chief conclusion: U.S. texts 
include, and U.S. teachers teach, far more top- 
ics than is typical in other nations. This lack 
of focus, the authors assert, is due to the ab- 
sence of any “coherent vision” of U.S. school 
mathematics. LAS 


Education, S(15-18), P. Assessment Stan- 
dards for School Mathematics. NCTM, 1995, 
ix + 102 pp, $15 (P). [ISBN 0-87353-419- 
0] Third book in NCTM Standards trilogy. 
Assumes all students can meet high standards 
and that assessment is interwoven with instruc- 
tion. Addresses four purposes of assessment: 
monitoring students’ progress, making instruc- 
tional decisions, evaluating students’ achieve- 
ment, and evaluating programs. Presents six 
standards: mathematics, learning, equity, open- 
ness, inferences, and coherence. Numerous ex- 
amples. MW 


Education, S(17-18), P. Mathematics for To- 
morrow’s Young Children: International Per- 
spectives on Curriculum. Eds: Helen Mansfield,
Neil A. Pateman, Nadine Bednarz. Math.
Educ. Lib., V. 16. Kluwer Academic, 1996, xi +
327 pp, $128. [ISBN 0-7923-3998-3] Papers
from a Working Group at ICME-7 focus on
concept development and conceptual change, 
including the influences of social and cultural 
environments, the roles of language and sym- 
bolism, and the interactions between teachers’ 
curriculum decisions and their beliefs about 
concept formation. Strong focus on the role 
of social interaction in students’ construction of 
mathematical ideas, and on similarities and dif- 
ferences among various constructivistic types 
of learning theories. MW 


Education, S(17-18), P. Approaches to Alge-
bra: Perspectives for Research and Teaching. 
Eds: Nadine Bednarz, Carolyn Kieran, Lesley 
Lee. Math. Educ. Lib., V. 18. Kluwer Aca- 
demic, 1996, xv +345 pp, $145. [ISBN 0-7923- 
4145-7] Collection of research papers on the 
development of algebraic thinking among sec- 
ondary school students. Focus is on didactic 
questions such as: what students should know, 
what learning is prerequisite, what obstacles to 
learning arise, what situations promote alge- 
braic thinking. Questions are discussed in the 
context of four conceptions of algebra: (1) as 
generalizations of numerical and geometric pat- 
terns and laws governing numerical relations; 
(2) as a means of solving specific problems
or classes of problems; (3) as a tool for model- 
ing physical phenomena; and (4) as a study of 
the concepts of variable and function. MW 


Education, S(17-18), P. Gender and Mathe- 
matics Education. Eds: Barbro Grevholm, Gila 
Hanna. Lund Univ Pr, 1995, 428 pp. [ISBN 
91-7966-276-5] Proceedings of a 1993 Inter- 
national Commission on Mathematical Instruc- 
tion (ICMI) conference. Explores the causes, 
manifestations, and possible solutions for gen- 
der inequities in mathematics participation and 
persistence. Papers address specific issues and 
present “snapshots” of gender issues in nearly 
a dozen countries. MW 


History, P, L. A Logical Journey: From Gödel
to Philosophy. Hao Wang. MIT Pr, 1996,
xiv + 391 pp, $40. [ISBN 0-262-23189-1] A
continuation of the author’s Reflections on Kurt
Gödel (TR, January 1988). Reports the author’s
conversations with Gödel on interpretation of
his published work. Fascinating chapters on
mind vs. machine and on Gödel’s Platonism and
objectivism. An example: Gödel believed both
“... mathematics describes a non-sensual reality
which exists independently ... of the human
mind” and that his 1951 Gibbs Lecture proved
this. SK
Number Theory, T(18: 1), P*. Duality in Analytic
Number Theory. P.D.T.A. Elliott. Tracts
in Math., V. 122. Cambridge Univ Pr, 1997,
xviii + 341 pp, $59.95. [ISBN 0-521-56088-
8] A mathematical autobiography: one of the 
masters explains the tools of analytic number 
theory, especially sieve theory, within the con- 
text of how he has used them and how he thinks 
about them. DB 


Number Theory, T(18: 1), P. The Hardy— 
Littlewood Method, Second Edition.  R.C. 
Vaughan. Tracts in Math., V. 125. Cambridge 
Univ Pr, 1997, xiii + 232 pp, $49.95. [ISBN 
0-521-57347-5] Expanded to include recent 
results on Waring’s problem, especially Woo- 
ley’s result that G(k) is bounded by k log k +
k log log k + O(k). (First Edition, TR, March
1982.) DB 


Number Theory, P. Surfing on the Ocean 
of Numbers—A Few Smarandache Notions and 
Similar Topics. Henry Ibstedt. Erhus Univ Pr, 
1997, 75 pp, $9.95 (P). [ISBN 1-879585-57-X] 


Group Theory, T(18: 1), S, P. Automorphic
Forms and Representations. Daniel Bump.
Stud. in Adv. Math., V. 55. Cambridge Univ 
Pr, 1997, xiv + 574 pp, $79.95. [ISBN 0- 
521-55098-X] Written at a level intermediate 
between an advanced text and a monograph. 
The four chapters, which can be read inde- 
pendently, are Modular Forms, Automorphic 
Forms and Representations of GL(2, R), Auto- 
morphic Representations, and Representations 
of GL(2) Over a p-adic Field. JS 


Group Theory, P. Linear Algebra and Group 
Theory for Physicists. K.N. Srinivasa Rao. Wi- 
ley, 1996, 623 pp, $34.95. [ISBN 0-470-22061- 
9] Starts with basic group theory and linear 
algebra, then covers representation theory in- 
cluding representations of finite groups, linear 
associative algebras, the symmetric group, the 
rotation group, crystallographic point groups, 
and the Lorentz group. LC 

Group Theory, T(18), P. Low Rank Repre- 
sentations and Graphs for Sporadic Groups. 
Cheryl E. Praeger, Leonard H. Soicher. Aus- 
tralian Math. Soc. Lect. Ser., V. 8. Cambridge 
Univ Pr, 1997, xi + 141 pp, $39.95 (P). [ISBN 
0-521-56737-8] 

Algebra, P, L*. Finite Fields. Rudolf Lidl, 
Harald Niederreiter. Ency. of Math. & Its Ap- 
plic., V. 20. Cambridge Univ Pr, 1997, xiv + 
755 pp, $95. [ISBN 0-521-39231-4] Reprint 
of an excellent resource. (1983 Addison- 
Wesley edition, TR, April 1987.) CEC 


Algebra, T(18: 2). Groups and Characters. 
Larry C. Grove. Pure & Appl. Math. Wiley, 
1997, viii + 212 pp, $54.95. [ISBN 0-471-
16340-6] Focuses primarily on finite groups: 
character theory, transfer and splitting, Frobe- 
nius groups. Includes some computational as- 
pects: Schreier—Sims algorithm, Todd—Coxeter 
coset enumeration, etc. LC 


Algebra, T(18), P. Representation Theory 
and Complex Geometry. Neil Chriss, Vic- 
tor Ginzburg. Birkhauser Boston, 1997, x 
+ 495 pp, $64.50. [ISBN 0-8176-3792-3] 
Overview of recent advances in representation
theory (particularly for Weyl groups, \(SL_n(\mathbb{C})\),
and affine Hecke algebras), emphasizing the
geometric aspects: Springer theory, K-theory,
flag varieties, perverse sheaves. Self-contained, 
with many new results and new approaches. Ac- 
cessible to students of Lie theory. TH 


Differential Equations, P. Frequency Meth- 
ods in Oscillation Theory. G.A. Leonov, I.M. 
Burkin, A.I. Shepeljavyi. Math. & Its Applic., 
V. 357. Kluwer Academic, 1996, xii + 403 pp, 
$196. [ISBN 0-7923-3896-0] Criteria for the 
existence of periodic solutions to ODEs. Trans- 
lation of Russian original. SK 


Partial Differential Equations, P. Time Dependent
Problems and Difference Methods.
Bertil Gustafsson, Heinz-Otto Kreiss, Joseph
Oliger. Pure & Appl. Math. Wiley, 1995, xi
+ 642 pp, $79.95. [ISBN 0-471-50734-2] Introduction
to PDEs and numerical methods for
them. Directed at physical scientists and engi- 
neers. SK 


Differential Geometry, P. Geometry of Non- 
positively Curved Manifolds. Patrick B. Eber- 
lein. Lect. in Math. Ser. Univ of Chicago 
Pr, 1996, vii + 449 pp, $45 (P); $90. [ISBN 
0-226-18198-7; 0-226-18197-9] Presentation 
of recent results on manifolds of non-positive 
sectional curvature, leading up to a proof of the 
Mostow Rigidity Theorem. JO 


Differential Geometry, T(18: 1). Differential 
Geometry: Cartan’s Generalization of Klein’s 
Erlangen Program. R.W. Sharpe. Grad. Texts 
in Math., V. 166. Springer-Verlag, 1997, xix + 
421 pp, $49.95. [ISBN 0-387-94732-9] De- 
velopment of differential geometry leading to 
the global version of Cartan connections. The 
exposition is nicely influenced by the history 
of the subject, and by the author’s question of 
modern differential geometry: “What [has] be- 
come of the curves and surfaces?” JO 


Mathematical Modeling, S(13-15), L.  In- 
terdisciplinary Lively Application Projects 
(ILAPs). Ed: David C. Arney. MAA, 1997, 
xii + 222 pp, $27.50 (P). [ISBN 0-88385- 
706-5] Eight interdisciplinary projects, akin 
to UMAP modules, followed by six brief articles
on the philosophy of their development
and use. Projects range from the economics 
of building a deck to energy consumption on 
a mountain bike, from modeling smog in Los 
Angeles to heavy metal contaminants in an 
aquifer. Prerequisites range from precalculus 
to advanced calculus and differential equations. 
Each project includes instructions for students, 
comments for instructors, and sample solutions. 
A product of the NSF-supported INTERMATH 
project centered at West Point. LAS 


Probability, T(17), P. A Weak Convergence 
Approach to the Theory of Large Deviations. 
Paul Dupuis, Richard S. Ellis. Ser. in Prob. 
& Stat. Wiley, 1997, xvii + 479 pp, $69.95. 
[ISBN 0-471-07672-4] Shows how to apply 
weak convergence theory in a consistent way 
to prove large deviation results. Develops the 
weak convergence approach from scratch, illus- 
trates it via examples, applies it to sophisticated 
models. KB 


Mathematical Statistics, T(14-16). Statistical
Methods: A Geometric Primer. David J. Sav- 
ille, Graham R. Wood. Springer-Verlag, 1996, 
xi + 268 pp, $39.95 (P). [ISBN 0-387-94705- 
1] Unique geometric perspective on funda- 
mental statistical methods. Focuses on pro- 
viding insights into mathematical foundations 
of the methods, yet all topics introduced and 
motivated through scientific problems and data 
analyses. Graphics enhance understanding and 
exercises are interesting/appropriate. MK 


Statistical Methods, T(18), P, L. Sequential 
Estimation. Malay Ghosh, Nitis Mukhopad- 
hyay, Pranab K. Sen. Ser. in Prob. & Stat. Wi- 
ley, 1997, xiv + 480 pp, $69.95. [ISBN 0-471- 
81271-4] Classical and modern techniques in 
sequential estimation including parametric and 
nonparametric methods. Topics: shrinkage, 
empirical and hierarchical Bayes procedures, 
time-sequential estimation, empirical and hi- 
erarchical populations sampling, reliability es- 
timations, and capture-recapture methodology 
leading to sequential schemes. KB 


Statistical Methods, S(17-18). Statistical 
Tools for Nonlinear Regression: A Practical 
Guide with S-PLUS Examples. Sylvie Huet, 
et al. Ser. in Stat. Springer-Verlag, 1996, ix +
154 pp, $42.95. [ISBN 0-387-94727-2] Well- 
written text. Includes numerous examples, the- 
oretical constructs, and fully-worked applica- 
tions. Topics include nonlinear regression and 
parameter estimation; interval estimation and 
testing; variance estimation; diagnosing model 
misspecifications; calibration and prediction. 
No exercises. MK 


Statistical Methods, P. Applied Wavelet Analysis
with S-PLUS. Andrew Bruce, Hong-Ye
Gao. Springer-Verlag, 1996, xxi + 338 pp, 
$49.95 (P). [ISBN 0-387-94714-0] Beauti- 
ful book introduces wavelet theory through sci- 
entific applications and data analyses, includ- 
ing image and speech analyses. Approach 
is graphical/visual. Written for scientists and 
researchers interested in applying these tech- 
niques. No exercises. MK 


Statistical Methods, T(17-18: 1), S. The 
EM Algorithm and Extensions. Geoffrey J. 
McLachlan, Thriyambakam Krishnan. Ser. in 
Prob. & Stat. Wiley, 1997, xvii + 274 pp, 
$59.95. [ISBN 0-471-12358-7] Excellent re- 
source for theory and application of EM algo- 
rithm and its many variations. Applications 
range from simple, one-parameter multinomi- 
als to hidden Markov models, epidemiologi- 
cal models, and neural networks. No exer- 
cises. MK 


Statistical Methods, T(16-18: 1, 2). Modern
Regression Methods. Thomas P. Ryan. Ser. 
in Prob. & Stat. Wiley, 1997, xix + 515 pp, 
$64.95, with disk. [ISBN 0-471-52912-5] 
Thorough survey covers simple linear regres- 
sion through nonparametric, robust, and non- 
linear regression. Plenty of good exercises and 
data sets. Could be used for introductory re- 
gression course or for more advanced methods 
course. MK 


Statistics, T(13-14: 1). Statistics from Scratch: 
An Introduction for Health Care Profession- 
als. David Bowers. Wiley, 1996, vi + 180 pp, 
$19.95 (P). [ISBN 0-471-96325-9] Survey of 
introductory statistical topics for health science 
students. Tone makes material accessible. In- 
cludes standard introductory topics as well as 
stratified sampling plans, cohort studies, case- 
control studies, and clinical trials. Concepts 
stressed without mathematical detail. Relies on 
accessible software. Numerous exercises. MK 


Theory of Computation, T(16-17: 1, 2), L. 
Computability and Complexity: From a Pro- 
gramming Perspective. Neil D. Jones. Found. 
of Computing. MIT Pr, 1997, xvi + 466 pp, $45.
[ISBN 0-262-10064-9] Interesting, integrated 
treatment of models of computation (program- 
ming languages, TM’s, lambda calculus, infer- 
ence systems, etc.), and complexity, based on 
and motivated by notions and techniques from 
programming language perspective, via basic 
model of LISP-like programs. Nicely exploits 
theoretical and practical connections (self eval- 
uators, partial evaluation). RM 


Computer Science, T(16-17: 1), P. Specification
of Abstract Data Types. Jacques Loeckx,
Hans-Dieter Ehrich, Markus Wolf. Wiley,
1996, xi + 260 pp, $60. [ISBN 0-471-95067-X]
Introduction to the theory of algebraic program 
specification. Unlike some texts in the field, this 
one is intended to be accessible to students un- 
familiar with category theory. Includes design 
of a lexical analyzer as a case study. JO 


Applications (Economics), P. Prices, Cycles, 
and Growth. Hukukane Nikaido. Stud. in Dy- 
nam. Econ. Sci. MIT Pr, 1996, xix + 285 pp, 
$40. [ISBN 0-262-14059-4] Fourteen of the 
author’s research papers analyzing the stabil- 
ity of equilibria and cycles in various economic 
models. SK 

Applications (Engineering), T(15-17: 1, 2), 
S, P. Fuzzy Engineering. Bart Kosko. Prentice 
Hall, 1997, xxvi + 549 pp, with disk. [ISBN 
0-13-124991-6] Fuzzy systems are approximations
$F : \mathbb{R}^n \to \mathbb{R}^p$ modeling an associative
processor given by m rules of the form
“if X is a fuzzy set A, then Y is a fuzzy set
B.” Approach departs from the common “linguistic”
view (fuzziness modeling, how we reason
with rules of thumb, etc.), to perspective
of function approximation with additive fuzzy 
systems. Treats chaos and control, signal pro- 
cessing, communication, etc. RM 


Applications (Physical Science), S(17-18), P.
The Sentinel Method and Its Application to Environmental
Pollution Problems. Jean-Pierre
Kernévez. Math. Modelling Ser. CRC Pr,
1997, 204 pp, $69.95. [ISBN 0-8493-9630-1]
Describes a method for solving parameter es- 
timation problems for environmental pollution 
models for which data are incomplete. LB 


Applications (Physical Science), S(11-14), L.
Mission Mathematics: Grades 9-12. Peggy 
House. NCTM, 1997, vi + 121 pp, $19.95 (P). 
[ISBN 0-87353-436-0] An innovative collec- 
tion of authentic case studies of mathematical 
ideas behind space science: launch windows, 
orbital debris, global positioning systems, ellip- 
tical orbits, communication links. Each chap- 
ter introduces its subject with extensive refer- 
ences to real events and data, then is followed 
by student assignments on reproducible work- 
sheets. LAS 
an undergraduate Russian major (A.B. Dartmouth) before turning to statistics in graduate
school (M.S. Medical College of Virginia, Ph.D. Harvard). He currently chairs the Joint
Gould.


DAVID MOORE was trained in mathematics (A.B. Princeton, Ph.D. Cornell). His Cornell 
John Horton Conway’s unique approach to quadratic forms
was the subject of the Hedrick Lectures given by him in 
August 1991 at the Joint Meetings of the American Mathemat- 
ical Society and the Mathematical Association of America in 
Orono, Maine. Professor Conway provides the following 
overview of those lectures: 


“I have been interested in quadratic forms for many years, 
but keep on discovering new and simple ways to understand 
them. The “topograph” of the First Lecture makes the entire 
theory of binary quadratic forms so easy that we no longer 
need to think or prove theorems about these forms—just 
look! In some sense the experts knew something like this 
picture—but why did they use it only in the analytic theory, 
rather than right from the start? 


Since sight and hearing were now involved, I took as the
theme of the lectures the idea that one should try to appreci- 
ate quadratic forms with all one’s senses, and so arose the 
title “The Sensual Form” for my Hedrick Lectures, and also 
the topics for the first two of them. 


I could not settle on a single topic for the third of these lec- 
tures, even when I came to give it. So, in the end, I split it 
into two half-hour talks, one on the shape of the Voronoi cell 
of a lattice, and one on the Hasse-Minkowski theory. In this 
book, each of these has become a fully-fledged lecture.


The book should not be thought of as a serious textbook on 
the theory of quadratic forms—it consists rather of a number 
of essays on particular aspects of quadratic forms that have
interested me. The lectures are self-contained and will be 
accessible to the generally informed reader who has no par- 
ticular background in quadratic form theory. The minor 
exceptions should not interrupt the flow of ideas. The After- 
thoughts to the Lectures contain discussions of related mat- 
ters that occasionally presuppose greater knowledge. 


Since so much of the treatment is new to this book, it may 
not be easy to circumvent one’s difficulties by reference to 
standard texts. I hope the work pays off, and that even the 
experts in quadratic forms will find some new enlighten- 
ment here.” —John Horton Conway 
Second Lecture: Can You Hear the Shape of a Lattice?;
Afterthoughts: Kneser’s Gluing Method: Unimodular Lattices; 

Third Lecture: ...and Can You Feel its Form?; Afterthoughts: 
Feeling the Form of a Four-Dimensional Lattice; 

Fourth Lecture: The Primary Fragrances; Afterthoughts: 
More about the Invariants: The p-Adic Numbers; 

Postscript: A Taste of Number Theory. 
Underwood Dudley has done it again with a witty, fascinating book about
number mystics. If you enjoyed Underwood Dudley’s Mathematical Cranks, 
you must buy this book. 


Underwood Dudley has assembled another delightful 
collection of essays that will amuse, engage and 
instruct you. Dudley, author of the immensely popular 
MAA titles Mathematical Cranks, and The Trisectors, 
has turned his attention in this volume to numerolo- 
gists. Once you start reading about them, you won't 
be able to put the book down. 


We learn in the introduction: 
“For some people, numbers do much more than 
merely count and measure. For some people, 
numbers have meanings, they have inwardness, 
they can be magic and versatile, or young and 
sprightly. I am not one of those people, since I 
think numbers have quite enough to do as it is, 
but for the crowd of number mystics, numerologists,
pyramidologists, number-of-the-beasters, and
others whose ideas and work will be described in 
the following chapters numbers have powers far 
out of the ordinary.” 


Number mystics, Dudley explains, originated with 
Pythagoras 2500 years ago and continue to this day. 
Numerology is applied number mysticism and is a 
more recent invention. You will find a history of 
number mysticism and numerology in the book, with 
a wealth of examples from the past as well as the 
present. Meet the Elliott Wave Theorists (who 
explain the movement of the stock market with 
Fibonacci numbers); the Bible-numberists who find 
7s, 11s, 13s, or perfect squares in the Bible; the 
researcher who finds 57s throughout the American 
Revolution; the pyramidologists who see all of 
human history in numbers derived from measure- 
ments of the great pyramid of Egypt, and much 
more. Meet them all in the pages of this wonderful 
new book. 
Elementary Mathematical Models


Order Aplenty and a Glimpse of Chaos


Series: Classroom Resource Materials


New in the Classroom Resource Materials Series...suitable
as a text in college algebra, finite mathematics, precalculus,
and liberal arts mathematics courses


Elementary Mathematical Models (EMM) claims a middle ground
between college algebra and liberal arts mathematics. Like the
college algebra course, EMM emphasizes the elementary functions of
analysis: linear, quadratic, polynomial, and rational functions;
square roots; exponentials and logarithms. These functions are the
building blocks for the simple models that appear in first courses
in the physical, life, and social sciences. And while EMM does not
stress algebraic manipulation as an end in itself, it does recognize
how important algebra is. Moreover, it provides students with the
opportunity to see for themselves why algebra is needed, and
what it contributes to formulating and analyzing models.


Like the liberal arts mathematics course, EMM makes a concerted
effort to convey something of the scope, power, and
fascination of mathematics, to students who may never study
mathematics again. Each mathematical topic evolves naturally
in formulating simple discrete models for inherently
interesting contexts. For example, exponential functions
emerge from the study of models exhibiting geometric
growth, defined as growth that increases by equal proportions
in equal periods of time. Throughout the course, a
recurring theme is the evolution from simple recursive
hypotheses (e.g., the population next year will be 10%
greater than this year), to difference equations
($p_{n+1} = 1.1\,p_n$), to solutions ($p_n = p_0 \cdot 1.1^n$),
to qualitative behavior of models. This theme appears
repeatedly as the students encounter a series of increasingly
sophisticated growth models, starting with arithmetic growth
and ending with logistic growth. The course climaxes with an
exploration of the chaotic behavior that can occur in logistic
growth models.
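
The progression from recursive hypothesis to difference equation to solution to qualitative behavior can be tried out numerically. The following is only a minimal sketch, not material from the book: the helper names (`iterate`, `geometric_step`, `logistic_step`), the starting values, and the particular logistic form $p_{k+1} = r\,p_k(1 - p_k)$ are assumptions chosen for illustration.

```python
def iterate(step, p0, n):
    """Return [p_0, p_1, ..., p_n] for the difference equation p_{k+1} = step(p_k)."""
    values = [p0]
    for _ in range(n):
        values.append(step(values[-1]))
    return values


def geometric_step(p):
    # Recursive hypothesis from the text: next year's population is 10% larger.
    return 1.1 * p


def logistic_step(r):
    # Assumed standard form of a logistic model: p_{k+1} = r * p_k * (1 - p_k).
    return lambda p: r * p * (1.0 - p)


# Geometric growth: iterating the rule agrees with the solution p_n = p_0 * 1.1**n.
p0 = 100.0
geometric = iterate(geometric_step, p0, 10)
closed_form = [p0 * 1.1 ** n for n in range(11)]

# Logistic growth with r = 2.5 settles toward the fixed point 1 - 1/r = 0.6.
settled = iterate(logistic_step(2.5), 0.2, 200)

# With r = 4.0, two starting values that differ by one part in a billion
# end up on visibly different trajectories, a hallmark of chaotic behavior.
a = iterate(logistic_step(4.0), 0.2, 60)
b = iterate(logistic_step(4.0), 0.2 + 1e-9, 60)
largest_gap = max(abs(x - y) for x, y in zip(a, b))

print(geometric[10], closed_form[10])
print(settled[-1])
print(largest_gap)
```

Changing the parameter `r` between 2.5 and 4.0 is one way to watch the orderly behavior give way to the irregular behavior the course ends with.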


The presentation is accessible even to students with a weak 
algebraic background. Throughout the book, numerical, graphi- 
cal, and symbolic approaches are used systematically. There is 
a rich collection of examples and exercises. Reading compre- 
hension exercises in each chapter provide a strong emphasis 
on reading and writing about mathematical concepts. 


Contents: Overview; Sequences and Difference Equations;
Arithmetic Growth; Linear Graphs, Functions and Equations;
Quadratic Growth Models; Quadratic Graphs, Functions and
Equations; Polynomials and Rational Functions; Fitting a
Line to Data; Geometric Growth; Exponential Functions;
More on Logarithms; Geometric Sums and Mixed Models;
Logistic Growth; and Chaos in Logistic Models.
Miscellaneous Problems and Essays
Series: Dolciani Mathematical Expositions 


Ross Honsberger 


Another elegant collection of problems from Ross Honsberger 


The study of mathematics is often undertaken with
an air of such seriousness that it doesn’t always seem
to be much fun at the time. However, it is quite
amazing how many surprising results and brilliant
arguments one is in a position to enjoy with just a
high school background. This is a book of
miscellaneous delights, presented not in an attempt
to instruct but as a harvest of rewards that are due
good high school students and, of course, those
more advanced: their teachers, and everyone in
the university mathematics community. Admittedly,
they take a little concentration, but the price is a
bargain for such gems.

A half dozen essays are sprinkled among some
hundred problems, most of which are the easier
problems that have appeared on various national and
... represented: combinatorics, geometry, number
theory, algebra, probability, .... The sections may be
read in any order. The book concludes with twenty-five
exercises and their detailed solutions.

Something to delight will be found in every section
(a surprising result, an intriguing approach, a stroke
of ingenuity), and the leisurely pace and generous
explanations make them a pleasure to read.

The inspiration for many of the problems came from
the Olympiad Corner of Crux Mathematicorum,
published by the Canadian Mathematical Society.

Catalog Code: DOL-19/JR

328 pp., Paperbound, 1997

ISBN 0-88385-326-4
Fourth Edition


The Carus Mathematical Monographs, Number 13

This is a revised, updated and augmented edition of 
a classic Carus monograph (a bestseller for over 25 
years) on the theory of functions of a real variable. 
Earlier editions of this classic Carus Monograph cov- 
ered sets, metric spaces, continuous functions, and 
differentiable functions. The fourth edition adds sec- 
tions on measurable sets and functions, the Lebesgue 
and Stieltjes integrals, and applications. The book is 
accessible to readers with some mathematical sophis- 
tication and a background in calculus. It is suitable 
either for self-study or for supplemental reading in a 
course on advanced calculus or real analysis. 


Not intended as a systematic treatise, this book has 
more the character of a sequence of lectures on a 
variety of topics connected with real functions. 
Many of these topics are not commonly encountered 
in undergraduate textbooks: for example, the exis- 
tence of continuous everywhere-oscillating functions 
(via the Baire category theorem); two functions having
equal derivatives, yet not differing by a constant;
application of Stieltjes integration to the speed of 
convergence of infinite series. 
Revised and updated by Harold P. Boas


Series: Carus Mathematical Monographs


Table of Contents: 

I. Sets: Sets of real numbers, Countable and uncount- 
able sets, Metric spaces, Open and closed sets, Dense 
and nowhere dense sets, Compactness, Convergence 
and completeness, Nested sets and Baire’s theorem, 
Some applications of Baire’s theorem, Sets of mea- 
sure zero. II. Functions: Functions, Continuous func- 
tions, Properties of continuous functions, Upper and 
lower limits, Sequences of functions, Uniform con- 
vergence, Pointwise limits of continuous functions, 
Approximations to continuous functions, Linear functions,
Derivatives, Monotonic functions, Convex functions,
Infinitely differentiable functions. III.
Integration: Lebesgue measure, Measurable functions,
Definition of the Lebesgue integral, Properties of 
Lebesgue integrals, Application of the Lebesgue inte- 
gral, Stieltjes integrals, Applications of the Stieltjes 
integral, Partial sums of infinite series. 


Catalog Code: CAM-13R/JR 

262 pp., Hardcover, 1996 

ISBN 0-88385-029-X 

List: $35.95 MAA Member: $24.95 


Phone in Your Order Now! 1-800-331-1622


Monday — Friday 8:30 am — 5:00 pm 


FAX (301) 206-9789 


or mail to: The Mathematical Association of America, PO Box 91112, Washington, DC 20090-1112


Shipping and Handling: Postage and handling are charged as follows: USA orders (shipped via UPS): $2.95 for the first book, and $1.00 for each additional book. 
Canadian orders: $4.50 for the first book and $1.50 for each additional book. Canadian orders will be shipped within 10 days of receipt of order via the fastest avail- 
able route. We do not ship via UPS into Canada unless the customer specially requests this service. Canadian customers who request UPS shipment will be billed an 
additional 7% of their total order. Overseas orders: $3.50 per item ordered for books sent surface mail. Airmail service is available at a rate of $7.00 per book.
Foreign orders must be paid in US dollars through a US bank or through a New York clearinghouse. Credit Card orders are accepted for all customers. 


Address

City, State, Zip

Phone

Qty / Catalog Code / Price / Amount: CAM-13R/JR

Shipping & handling

Total

All orders must be prepaid with the exception of books purchased for
resale by bookstores and wholesalers.

Payment: Check, VISA, MasterCard

Credit Card No., Expires

Signature


Cooperative Learning 
for Undergraduate 
Mathematics 


Series: MAA Notes 


THE MATHEMATICAL ASSOCIATION OF AMERICA


Readings in 
Cooperative Learning for 
Undergraduate Mathematics 


Ed Dubinsky, David Mathews, Barbara E. Reynolds, Editors 


An invaluable guide for teachers interested in new approaches to student learning 


The editors have combed through the literature in 
the area of cooperative learning, to select the 17 
papers presented here. They have used this material 
themselves in the various summer workshops, mini- 
courses at regional and national meetings, and World 
Wide Web courses conducted on behalf of the 
MAA’s project CLUME (Cooperative Learning in 
Undergraduate Mathematics Education). This project 
has received substantial funding from the National 
Science Foundation. 


To help you begin to assimilate the material, the edi- 
tors have provided, for each reading, a brief introduc- 
tion and a number of questions that can be used for 
discussion. The papers are organized into three cate- 
gories that represent major aspects of cooperative 
learning and its foundations in learning theory: 


• Constructivism and the Teacher’s Role: papers
concerned with the theoretical basis for cooperative
learning, how it relates to the traditional role
of the teacher and how that may change.


• Research and Effectiveness: papers which tell us
what has been found regarding the effectiveness of
cooperative learning and how that compares with
traditional pedagogical approaches.


• Implementation Issues: papers that focus on issues
specific to the implementation of cooperative
learning into an overall pedagogical approach.


Anyone who is interested in developing a deep 
understanding of cooperative learning will find much 
of interest in this volume. References at the end of 
each selection and an extended bibliography point to 
further readings. 
The Contest Problem Book V


George Berzsenyi and Stephen B Maurer


The Mathematical Association of America
New Mathematical Library

Over the years perhaps the most popular of the 
MAA problem books have been the high school 
contest books, covering the yearly American 
High School Mathematics Examinations (AHSME) 
that began in 1950, co-sponsored from the start 
by the MAA. Book V also includes the first six 
years of the American Invitational Mathematics 
Examination (AIME) which was developed as an 
intermediate step between the AHSME and the 
USA Mathematical Olympiad (USAMO). The 
AIME has a unique answer format — all answers 
are integers between 0 and 999. 


The editors of this volume, George Berzsenyi and 
Stephen B Maurer, were respectively the chair of 
the AIME and the AHSME during this period. In 
addition to a thorough index, they have added 
much material not included in Contest Books I-IV: 
American High School Mathematics Examinations and
American Invitational Mathematics Examinations, 1983-1988 


Series: New Mathematical Library 


George Berzsenyi and Stephen B Maurer 


a comprehensive guide to other problem 
materials world wide, 

additional solutions, 

dropped problems, 

statistical information, 

information on test development and history. 


This volume is a must for avid fans of elemen- 
tary problems. 


Contest Books I-IV appear as NML volumes 5, 
17, 25, and 29. 
Interdisciplinary Lively
Application Projects (ILAPs)


Series: Classroom Resource Materials


Interdisciplinary Lively Application Projects (ILAPs)
are small group projects developed through the
cooperation of faculty from mathematics and partner
disciplines. These projects will provide teachers with
material that can help their students understand
mathematical concepts, develop strong mathematical
skills and motivate them towards an interest in future
subjects accessible through the study of mathematics.
It is an important step towards helping students
acquire a broad, interdisciplinary outlook towards
problem solving.


The ILAPs provide supplemental classroom resource
materials in the form of eight project handouts that
you can use as student homework assignments. They
require students to use scientific and quantitative reasoning,
mathematical modeling, symbolic manipulation
skills, and computational tools to solve and analyze
scenarios, issues, and questions involving one or
more disciplines. Sample solutions to the problems,
background material, notes to the instructor and a student
writing guide are also included.


The prerequisite skills for the eight projects presented
in the book range from freshmen-level algebra,
trigonometry, and precalculus; through calculus, elementary
and intermediate differential equations, and
discrete mathematics to advanced calculus and partial
differential equations. The partner disciplines included
in the projects are: mechanics, physics, chemistry,
engineering, geography, topography, and exercise
physiology. You can use the projects as a supplement
to a textbook in any of the following undergraduate
areas: precalculus, calculus, linear algebra,
differential equations, discrete mathematics, mathematical
modeling, advanced calculus, partial differential
equations, and numerical computing.


The book also contains several supporting articles 
that describe uses for these projects. 


Contents: ILAPs: Getting Fit with Mathematics;
Decked Out; Parachute Panic; Flying with Differential
Equations; Planning a Backpacking Trip to Pikes
Peak; SMOG in Los Angeles Basin; Structural
Mechanics: Beams and Bridges; Contaminant
Transport. Articles: Technical Report Format and
Writing Guide; Project INTERMATH: An Interdisciplinary
Approach to Cultural Change; ILAP Products:
Authoring, Testing and Editing; Interdisciplinary and
Integrated Curriculum Models; Interdisciplinary Communication
and Understanding; Interdisciplinary Projects
at West Point.
Julia: A Life in Mathematics


Constance Reid 


Constance Reid, an established writer about mathematicians, has
written an excellent and loving book, about her sister Julia
Robinson, the mathematician. The author has written that she
wanted the book to be one for all age groups and she has succeeded
admirably in making it so...Julia wanted to be known as a mathematician,
not a woman mathematician and rightly so! However,
she was, and is, a wonderful role model for women aspiring to be
mathematicians. What a great gift this book would be!

—Alice Schafer, Former President, AWM 


This book is a small treasure, one which I want to share with all
my mathematical friends. The assembly of several articles and
additional photos and remarks provides the image of a mathematician
of extraordinary taste, tenacity and generosity.... Julia
Robinson broke ground in displaying the deep connections
between number theory and logic. Her results have led to a
very active area today, making the appearance of this book very
timely. Her work and her example are however timeless and I
can think of no better advice to give a young mathematician,
either in how to do mathematics, or how to behave in mathematics,
than: “Be like Julia!”

—Carol Wood, Deputy Director, MSRI 


Julia is the story of the life of Julia Bowman Robinson, the gift- 
ed and highly original mathematician who during her lifetime 
was recognized in ways that no other woman mathematician 
had been recognized up to that time. In 1976 she became the 
first woman mathematician elected to the National Academy of 
Sciences and in 1983 the first woman elected president of the 
American Mathematical Society. 


This unusual book, profusely illustrated with previously
unpublished personal and mathematical memorabilia, brings
together in one volume the prizewinning “Autobiography of
Julia Robinson” by her sister, the popular mathematical
biographer Constance Reid, and three very personal articles
about her work by outstanding mathematical colleagues.


All royalties from sales of this book will go to fund a Julia 
Robinson Prize in Mathematics at the high school from 
which she graduated. 
American Mathematical Society


African Americans in 


Mathematics 


Nathaniel Dean, Bell Laboratories, 
Murray Hill, NJ, Editor 


This volume contains research and expository 
papers by African-American mathematicians 
on issues related to their involvement in the 
mathematical sciences. Little is known, taught, 
or written about African-American mathemati- 
cians. Information is lacking on their past and 
present contributions and on the qualitative and 
quantitative nature of their existence in and dis- 
tribution throughout mathematics. This lack of 
information leads to a number of questions that 
have to date remained unanswered. This vol- 
ume provides details and pointers to help 
answer some of these questions. 


DIMACS: Series in Discrete Mathematics and 
Theoretical Computer Science, Volume 34; 
1997; 205 pages; Hardcover; ISBN 0-8218-0678-5; 
List $49; Individual member $29; Order code 
DIMACS/34MM711 


Discrete Mathematics in 
This volume is co-published with the National
Council of Teachers of Mathematics (NCTM), 
Reston, VA. 

DIMACS: Series in Discrete Mathematics and 
Theoretical Computer Science, Volume 36; 1997; 
452 pages; Hardcover; ISBN 0-8218-0448-0; List $30; 
All AMS members $24; Order code 
DIMACS/36MM711 


All prices subject to change. Charges for delivery are $3.00 per order. For air delivery outside of the continental U. S., please include $6.50 per
item. Prepayment required. Order from: American Mathematical Society, P.O. Box 5904, Boston, MA 02206-5904. For credit card orders, fax (401)
455-4046 or call toll free 800-321-4AMS (4267) in the U.S. and Canada, (401) 455-4000 worldwide. Or place your order through the AMS bookstore
at http://www.ams.org/bookstore/. Residents of Canada, please include 7% GST.


Recently Published by the AMS 


Introduction to Probability 


Second Revised Edition 


Charles M. Grinstead, Swarthmore 
College, PA, and J. Laurie Snell, 
Dartmouth College, Hanover, NH 


This text is designed for an introductory proba- 
bility course at the university level for sopho- 
mores, juniors, and seniors in mathematics, 
physical and social sciences, engineering, and 
computer science. It presents a thorough treat- 
ment of ideas and techniques necessary for a 
firm understanding of the subject. 


The text is also recommended for use in discrete 
probability courses. The material is organized so 
that the discrete and continuous probability dis- 
cussions are presented in a separate, but paral- 
lel, manner. 

1997; 510 pages; Hardcover; ISBN 0-8218-0749-8; 
List $49; All AMS members $39; Order code 
IPROBMM/711 


Mathematics and 
Mathematicians 


Mathematics in Sweden before 
1950 


Lars Garding, Lund University, Sweden 


This book is about mathematics in Sweden 
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to M. Riesz, T. Carleman, and A. Beurling. It 
tells the story of how continental mathematics 
came to Sweden, how it was received, and how 
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Important results are analyzed and re-proved in 
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GEOMETRIC CONSTRUCTIONS 


Geometric constructions have been 
a popular part of mathematics 
throughout history. The ancient 
Greeks made the subject an art, 
which was enriched by the med- 
ieval Arabs, but which required the 
algebra of the Renaissance for a 
thorough understanding. Through 
coordinate geometry, various geo- 
metric construction tools can be associated with var- 
ious fields of real numbers. This book is about these 
associations. The author writes in a charming style and 
nicely intersperses history and philosophy within the 
mathematics. He hopes that readers will learn a little 
geometry and a little algebra while enjoying the effort. 
This is as much an algebra book as it is a geometry 
book. 
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It is written for trainers and participants of contests of 
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complete, but some merely point to the road leading 
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GEOMETRY: PLANE AND FANCY 


Geometry: Plane and Fancy offers students a fasci- 
nating tour through parts of geometry they are unlike- 
ly to see in the rest of their studies while, at 
the same time, anchoring their excursions to the well 
known parallel postulate of Euclid. The author shows 
how alternatives to Euclid’s fifth postulate lead to interesting
and different patterns and symmetries.
In the process of examining geometric objects, the 
author incorporates the algebra of complex (and hyper- 
complex) numbers, some graph theory, and some 
topology. Nevertheless, the book retains its elemen- 
tary integrity. Readers are assumed to have had a course 
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While many concepts introduced are advanced, the 
mathematical techniques are not. Singer’s lively expo- 
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are nicely scattered throughout. The contents of the 
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Plane * Geometry of the Sphere * More Geometry of 
The Sphere ¢ Geometry of Space 
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NOTICE TO AUTHORS 


The MONTHLY publishes articles, as well as notes and 
other features, about mathematics and the profes- 
sion. Its readers span a broad spectrum of mathe- 
matical interests, and include professional mathe- 
maticians as well as students of mathematics at all 
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Ramanujan’s Association with Radicals 
in India 


Bruce C. Berndt, Heng Huat Chan, and Liang—Cheng Zhang 


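In memory of Ramanujan on the

$$\left(32\left(\tfrac{146410001}{48400}\right)^{3} - 6\left(\tfrac{146410001}{48400}\right) + \sqrt{\left(32\left(\tfrac{146410001}{48400}\right)^{3} - 6\left(\tfrac{146410001}{48400}\right)\right)^{2} - 1}\,\right)^{1/6}\text{th}$$

anniversary of his birth.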


Ramanujan (sometimes under the alternative spelling Ramanujam) submitted 
58 problems to the Journal of the Indian Mathematical Society. Approximately ten 
of them involve equalities between radicals. For example [16, p. 334],


5 5 1/35 5 5 

Establishing equalities among exotic radicals was very common in Ramanujan’s
day, especially in Great Britain and its empire. For example, see the then popular 
texts by H. S. Hall and S. R. Knight [12, Chap. 8] and G. Chrystal [10, Chap. 11], 
the latter being well-known to Ramanujan. Was Ramanujan’s keen interest in 
radical equalities merely a consequence of their popularity in his time, or were 
there other reasons? The answer can be found in his notebooks [15] and in one of 
his most important papers [14], [16, pp. 23-39]. 

Scattered among the pages in Ramanujan’s first notebook are the values of 107 
class invariants, or polynomials satisfied by them. As we shall see, these invariants 
frequently take the shapes of interesting radicals, and often to put the radical 
expressions in their most attractive forms, difficult radical equalities need to be 
established. So that we may define Ramanujan’s class invariants, set 


$$\chi(q) = \prod_{k=0}^{\infty}\left(1 + q^{2k+1}\right).$$

For any positive rational number $n$, set

$$q = \exp(-\pi\sqrt{n}),$$

and define the two class invariants $G_n$ and $g_n$ by

$$G_n := 2^{-1/4} q^{-1/24} \chi(q) \quad\text{and}\quad g_n := 2^{-1/4} q^{-1/24} \chi(-q). \qquad (1)$$

As we shall see, we are able to calculate $G_n$ for certain odd values of $n$ and $g_n$ for
certain even values of $n$.
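Definition (1) is easy to test numerically. The following is a minimal sketch, assuming the product form of $\chi(q)$ given above; the function names `chi`, `G`, and `g` are ours. It evaluates the truncated product and compares it with a few classical closed forms ($G_1 = 1$, $G_3 = 2^{1/12}$, $G_7 = 2^{1/4}$, $g_2 = 1$, and the value $G_5 = ((1+\sqrt5)/2)^{1/4}$).

```python
from math import exp, pi, sqrt

def chi(q, terms=60):
    """Truncated product chi(q) = prod_{k>=0} (1 + q^(2k+1)), valid for |q| < 1."""
    result = 1.0
    for k in range(terms):
        result *= 1.0 + q ** (2 * k + 1)
    return result

def G(n):
    """Class invariant G_n = 2^(-1/4) q^(-1/24) chi(q) with q = exp(-pi sqrt(n))."""
    q = exp(-pi * sqrt(n))
    return 2 ** -0.25 * q ** (-1 / 24) * chi(q)

def g(n):
    """Class invariant g_n = 2^(-1/4) q^(-1/24) chi(-q) with q = exp(-pi sqrt(n))."""
    q = exp(-pi * sqrt(n))
    return 2 ** -0.25 * q ** (-1 / 24) * chi(-q)

if __name__ == "__main__":
    print(G(1), G(5), g(2))
```

Because $q = \exp(-\pi\sqrt{n})$ is already small for $n \ge 1$, a handful of factors would suffice; 60 is a generous, arbitrary cutoff.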

At the beginning of the twentieth century, these invariants were extensively
studied by H. Weber [21], who used the notations $G_n =: 2^{-1/4}\mathfrak{f}(\sqrt{-n})$ and
$g_n =: 2^{-1/4}\mathfrak{f}_1(\sqrt{-n})$. Weber [21] proved that $G_n$ and $g_n$ are algebraic. In fact, $G_n$,
$2^{-1/12}G_n$, and $2^{-1/4}G_n$ are units in some algebraic number field according as
$n \equiv 1 \pmod 4$, $n \equiv 3 \pmod 8$, and $n \equiv 7 \pmod 8$, respectively. If $n \equiv 2 \pmod 4$,
then $g_n$ is a unit. Weber’s study of $G_n$ and $g_n$ was motivated by the construction of
the Hilbert class field $H_n$, the maximal unramified abelian extension of the imaginary
quadratic field $K_n := \mathbb{Q}(\sqrt{-n})$. It can be shown that $H_n = K_n(j(\omega_n))$, where

$$\omega_n = \begin{cases} \sqrt{-n}, & \text{if } n \equiv 1 \pmod 4, \\[4pt] \dfrac{3+\sqrt{-n}}{2}, & \text{if } n \equiv 3 \pmod 4, \end{cases}$$

and $j$ is the famous modular $j$-invariant, so-called because it is invariant under
transformations from the modular group. Weber [21] asserted that certain small
powers of $G_n$ and $g_n$ can be used to replace $j(\omega_n)$ as generators of $H_n$ over $K_n$.
Perhaps for these reasons, Weber called $G_n$ and $g_n$ class invariants, and computed
a total of 105 class invariants or the monic, irreducible polynomials satisfied by
them. The excellent text of D. A. Cox [11] provides an accessible account of
Weber’s work on invariants.

Before proceeding further, we give some examples that Ramanujan calculated: 


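$$G_5 = \left(\frac{1+\sqrt5}{2}\right)^{1/4}, \qquad (2)$$

$$G_{17} = \sqrt{\frac{5+\sqrt{17}}{8}} + \sqrt{\frac{\sqrt{17}-3}{8}},$$

$$G_{69} = \left(\frac{5+\sqrt{23}}{\sqrt2}\right)^{1/12}\left(\frac{3\sqrt3+\sqrt{23}}{2}\right)^{1/8}\left(\sqrt{\frac{6+3\sqrt3}{4}}+\sqrt{\frac{2+3\sqrt3}{4}}\right)^{1/2}, \qquad (3)$$

and

$$G_{1353} = (3+\sqrt{11})^{1/4}\,(5+3\sqrt3)^{1/4}\left(\frac{11+\sqrt{123}}{2}\right)^{1/4}\left(\frac{6817+321\sqrt{451}}{4}\right)^{1/12}$$
$$\times\left(\sqrt{\frac{17+3\sqrt{33}}{8}}+\sqrt{\frac{25+3\sqrt{33}}{8}}\right)^{1/2}$$
$$\times\left(\sqrt{\frac{561+99\sqrt{33}}{8}}+\sqrt{\frac{569+99\sqrt{33}}{8}}\right)^{1/2}.$$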


The value of $G_{69}$ was only recently verified for the first time by the authors [7].
In our calculation of $G_{69}$, we used the equality

$$\left(188 + 108\sqrt3 + \sqrt{(188 + 108\sqrt3)^2 - 1}\,\right)^{1/6} = \sqrt{\frac{6+3\sqrt3}{4}} + \sqrt{\frac{2+3\sqrt3}{4}},$$

which is the special case, $a = (4 + 3\sqrt3)/4$, of the more general equality

$$\left(32a^3 - 6a + \sqrt{(32a^3 - 6a)^2 - 1}\,\right)^{1/6} = \sqrt{a+\tfrac12} + \sqrt{a-\tfrac12},$$

which we also used in the dedication at the beginning of this paper and in
calculations of further class invariants.
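The general equality can be checked directly. The sketch below is ours and assumes the right-hand side reads $\sqrt{a+1/2}+\sqrt{a-1/2}$, which is the form consistent with the special case $a = (4+3\sqrt3)/4$. It also evaluates the dedication, $a = 146410001/48400$, in exact rational arithmetic.

```python
from fractions import Fraction
from math import isqrt, sqrt

def radical_lhs(a):
    """(32a^3 - 6a + sqrt((32a^3 - 6a)^2 - 1))^(1/6), in floating point."""
    t = 32 * a**3 - 6 * a
    return (t + sqrt(t * t - 1)) ** (1 / 6)

def radical_rhs(a):
    """sqrt(a + 1/2) + sqrt(a - 1/2), in floating point."""
    return sqrt(a + 0.5) + sqrt(a - 0.5)

def exact_sqrt(x):
    """Exact square root of a Fraction that happens to be a perfect square."""
    root = Fraction(isqrt(x.numerator), isqrt(x.denominator))
    assert root * root == x, "not a perfect square"
    return root

# Special case used for G_69.
a_special = (4 + 3 * sqrt(3)) / 4

# The dedication: both radicands are perfect squares of rationals.
a_dedication = Fraction(146410001, 48400)
anniversary = (exact_sqrt(a_dedication + Fraction(1, 2))
               + exact_sqrt(a_dedication - Fraction(1, 2)))

if __name__ == "__main__":
    print(radical_lhs(a_special), radical_rhs(a_special))
    print(anniversary)
```

The exact computation shows that the dedication is to the 110th anniversary, in agreement with the dedication of the following article in this issue.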

The value of $G_{1353}$ was communicated by Ramanujan [16, p. xxix, eq. (23)], [8,
p. 62] in his second letter, dated 27 February 1913, to G. H. Hardy and was first 
established (unrigorously) by G. N. Watson [18]. In a letter of 1 October 1930 to 
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B. M. Wilson [8, pp. 237, 238], Watson confided, “...but 23 which deals with the 
singular modulus associated with 1353 is included; I was pleased at getting this out, 
because the bulk of the singular moduli in the Notebooks can be obtained in the 
same way ... You will be interested to hear how Ramanujan got no. 23, 
particularly when you look at the length of the answer. I am absolutely convinced 
that he guessed it.” (Calculating a singular modulus, which we do not define here, 
is equivalent to calculating a class invariant.) The reader is undoubtedly astonished 
to learn that Ramanujan first “guessed” his formula for $G_{1353}$. We do not agree
with Watson! We think that Watson’s proof, which is not rigorous, could not have 
been given without his knowing the formula in advance. The first rigorous proof 
was given recently by Chan [9]. 

On pages 294-299 in his second notebook [15], Ramanujan gave a table of 
values for 77 class invariants, three of which are not found in the first notebook. 
Since the second notebook is an enlarged revision of the first, it is unclear why 
Ramanujan failed to record 33 class invariants that he offered in the first 
notebook. Four further results are found in scattered places in the second 
notebook. After arriving in Cambridge, Ramanujan learned of Weber’s work [21], 
and so when he wrote his paper [14], [16, pp. 23-39], the table of 46 class 
invariants that he included did not contain any that are found in Weber’s 
book [21]. Except for $G_{325}$ and $G_{363}$, all of the remaining values are found in
Ramanujan’s notebooks. To the best of our reckoning, Ramanujan calculated a 
total of 116 class invariants, or monic, irreducible polynomials satisfied by them. 

Why did Ramanujan calculate such a large number of class invariants? Ramanu- 
jan did not share Weber’s interest in generating Hilbert class fields, but he did 
have applications. First, as the title of his paper [14] indicates, Ramanujan used
class invariants to find excellent approximations to $\pi$. For example, from (1) and
(3), we find that

$$\pi \approx \frac{24}{\sqrt{69}}\left(\log G_{69} + \tfrac14\log 2\right) = 3.1415926536032\ldots,$$

which agrees with the value of $\pi$ through nine decimal places.
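As a rough illustration of how this approximation arises, note that (1) gives $\log(2^{1/4}G_n) = \pi\sqrt{n}/24 + \log\chi(q)$ with $\chi(q)$ extremely close to 1. The sketch below is ours and assumes the closed form for $G_{69}$ as we read it in (3); it uses no value of $\pi$ to build the approximation.

```python
from math import log, pi, sqrt

# Closed form for G_69, as we read it from (3); treat it as an assumption.
G69 = (((5 + sqrt(23)) / sqrt(2)) ** (1 / 12)
       * ((3 * sqrt(3) + sqrt(23)) / 2) ** (1 / 8)
       * (sqrt((6 + 3 * sqrt(3)) / 4) + sqrt((2 + 3 * sqrt(3)) / 4)) ** 0.5)

# pi ~ (24 / sqrt(n)) * log(2^(1/4) G_n) with n = 69.
pi_approx = 24 / sqrt(69) * (log(G69) + 0.25 * log(2))

if __name__ == "__main__":
    print(pi_approx, pi_approx - pi)
```

The error is about $(24/\sqrt{69})\,e^{-\pi\sqrt{69}}$, which explains the nine correct decimal places.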
Second, Ramanujan used class invariants to determine explicitly particular
values of the theta function $\varphi(q)$ defined by

$$\varphi(q) = \sum_{k=-\infty}^{\infty} q^{k^2}.$$

For example, Ramanujan probably used his value of $G_{49}$ to show that [3]

$$\frac{\varphi^2(e^{-7\pi})}{\varphi^2(e^{-\pi})} = \frac{\sqrt{13+\sqrt7}+\sqrt{7+3\sqrt7}}{14}\,(28)^{1/8}. \qquad (4)$$

The value

$$\varphi(e^{-\pi}) = \frac{\pi^{1/4}}{\Gamma\left(\tfrac34\right)}$$

is well known [1, p. 103], and so (4) provides an explicit evaluation for $\varphi(e^{-7\pi})$.
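Both evaluations can be tested from the defining series. This is a minimal sketch of our own, assuming that (4) is a statement about the squared quotient $\varphi^2(e^{-7\pi})/\varphi^2(e^{-\pi})$, which is the reading that matches the numbers.

```python
from math import exp, gamma, pi, sqrt

def phi(q, terms=50):
    """Theta function phi(q) = sum over all integers k of q^(k^2), for |q| < 1."""
    return 1.0 + 2.0 * sum(q ** (k * k) for k in range(1, terms))

quotient = (phi(exp(-7 * pi)) / phi(exp(-pi))) ** 2
closed_form = (sqrt(13 + sqrt(7)) + sqrt(7 + 3 * sqrt(7))) / 14 * 28 ** (1 / 8)

if __name__ == "__main__":
    print(quotient, closed_form)
    print(phi(exp(-pi)), pi ** 0.25 / gamma(0.75))
```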


The theta function ¢ is intimately connected with elliptic functions and inte- 
grals. An elliptic function is a function of a complex variable with two linearly 
independent periods, in contrast to the familiar trigonometric functions, which 
have just one linearly independent period. The complete elliptic integral of the 
first kind associated with the modulus $k$, $0 < k < 1$, is defined by

$$K := K(k) := \int_0^{\pi/2}\frac{d\theta}{\sqrt{1 - k^2\sin^2\theta}}. \qquad (5)$$

The complementary modulus $k'$ is defined by $k' = \sqrt{1 - k^2}$; set $K' = K(k')$. If
$q = \exp(-\pi K'/K)$, then one of the central theorems in the theory of elliptic
functions asserts that

$$\varphi^2(q) = \frac{2}{\pi}\int_0^{\pi/2}\frac{d\theta}{\sqrt{1 - k^2\sin^2\theta}} = \frac{2}{\pi}K(k). \qquad (6)$$

By (6), an explicit determination of $\varphi(q)$ for a certain value of $q$ also yields an
explicit value for $K(k)$.
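Relation (6) between the theta function and the complete elliptic integral can be checked numerically. The sketch below is ours; it evaluates $K(k)$ through the classical arithmetic-geometric mean formula $K(k) = \pi / (2\,\mathrm{AGM}(1, k'))$, a standard fact that is not part of the paper.

```python
from math import exp, pi, sqrt

def agm(a, b, iterations=40):
    """Arithmetic-geometric mean of a and b (converges quadratically)."""
    for _ in range(iterations):
        a, b = (a + b) / 2, sqrt(a * b)
    return a

def K(k):
    """Complete elliptic integral of the first kind via K(k) = pi / (2 AGM(1, k'))."""
    return pi / (2 * agm(1.0, sqrt(1 - k * k)))

def phi(q, terms=50):
    """Theta function phi(q) = sum over all integers n of q^(n^2)."""
    return 1.0 + 2.0 * sum(q ** (n * n) for n in range(1, terms))

def check(k):
    """Return (phi(q)^2, 2K(k)/pi) for q = exp(-pi K'/K)."""
    k_prime = sqrt(1 - k * k)
    q = exp(-pi * K(k_prime) / K(k))
    return phi(q) ** 2, 2 * K(k) / pi

if __name__ == "__main__":
    for k in (0.3, 1 / sqrt(2), 0.9):
        print(k, check(k))
```

For $k = 1/\sqrt2$ one has $K' = K$ and $q = e^{-\pi}$, so this case recovers the value of $\varphi(e^{-\pi})$ quoted earlier.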
A second classical theta function is the Dedekind eta-function $\eta(z)$ defined by

$$f(-q) = q^{-1/24}\eta(z) = \prod_{k=1}^{\infty}\left(1 - q^{k}\right) = \sum_{k=-\infty}^{\infty}(-1)^k q^{k(3k-1)/2}, \qquad (7)$$

where $q = \exp(2\pi i z)$ and $|q| < 1$. The exponents $k(3k - 1)/2$ are called pentagonal
numbers, and the last equality in (7) constitutes Euler’s pentagonal
number theorem. From (1) and (7), we easily see that


$$G_n = 2^{-1/4}q^{-1/24}\frac{f(q)}{f(-q^2)} \quad\text{and}\quad g_n = 2^{-1/4}q^{-1/24}\frac{f(-q)}{f(-q^2)} \qquad (8)$$

when $q = \exp(-\pi\sqrt{n})$. Ramanujan likely used his values of class invariants to
calculate explicitly certain products of eta-functions in both his first and lost
notebooks [15], [17]. For example, he probably used the values of $G_{225}$ and $G_{9/25}$
to prove that


$$\frac{e^{6\pi/5}\,f(-e^{-6\pi/5})}{\sqrt5\,f(-e^{-30\pi})} = \frac{a+b}{a-b}, \qquad (9)$$

where $a = (60)^{1/4}$ and $b = 2 - \sqrt3 + \sqrt5$.
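Two facts used above are easy to test numerically: Euler's pentagonal number theorem, which equates the infinite product $\prod_{k\ge1}(1-q^k)$ with the series $\sum_k (-1)^k q^{k(3k-1)/2}$, and the product identity $\chi(q) = f(q)/f(-q^2)$ that links definition (1) to eta-quotients. A minimal sketch with our own function names:

```python
def f_product(q, terms=200):
    """f(-q) as the product prod_{k>=1} (1 - q^k), for |q| < 1."""
    result = 1.0
    for k in range(1, terms + 1):
        result *= 1.0 - q ** k
    return result

def f_pentagonal(q, terms=30):
    """f(-q) as the series sum_k (-1)^k q^(k(3k-1)/2), k from -terms to terms."""
    total = 0.0
    for k in range(-terms, terms + 1):
        sign = -1.0 if k % 2 else 1.0
        total += sign * q ** (k * (3 * k - 1) // 2)
    return total

def chi(q, terms=200):
    """chi(q) = prod_{k>=0} (1 + q^(2k+1))."""
    result = 1.0
    for k in range(terms):
        result *= 1.0 + q ** (2 * k + 1)
    return result

if __name__ == "__main__":
    q = 0.3
    print(f_product(q), f_pentagonal(q))
    # f(q) is f(-x) evaluated at x = -q; f(-q^2) is f(-x) at x = q^2.
    print(chi(q), f_product(-q) / f_product(q * q))
```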
Ramanujan also used class invariants to determine values of the celebrated
Rogers-Ramanujan continued fraction $R(q)$, defined by

$$R(q) := \cfrac{q^{1/5}}{1+\cfrac{q}{1+\cfrac{q^{2}}{1+\cfrac{q^{3}}{1+\cdots}}}}, \qquad |q| < 1.$$

The behavior of $R(q)$ for $|q| = 1$ is not completely understood, but if you have had
a course in elementary number theory, perhaps you have shown that

$$R(1) = \frac{\sqrt5-1}{2}.$$

Using the value of $G_5$ given in (2), we can show that

$$R^5(e^{-2\pi/\sqrt5}) = \sqrt{\left(\frac{5\sqrt5+11}{2}\right)^{2}+1} - \frac{5\sqrt5+11}{2}.$$
To offer another example [4], Ramanujan undoubtedly used (9) to show that

$$R(e^{-6\pi}) = \sqrt{c^{2}+1} - c,$$

where

$$2c = 1 + \frac{a+b}{a-b}\sqrt5, \qquad a = (60)^{1/4}, \quad b = 2 - \sqrt3 + \sqrt5.$$


Ramanujan calculated several further values of R(q) in his lost notebook [17], and 
many of these can be found in the authors’ paper [6]. 
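Values of the Rogers-Ramanujan continued fraction are simple to approximate by evaluating a finite truncation from the bottom up. The sketch below is ours; the closed form for $R^5(e^{-2\pi/\sqrt5})$ is the one we read from the text, namely $\sqrt{C^2+1}-C$ with $C = (5\sqrt5+11)/2$, and should be treated as an assumption about the garbled display.

```python
from math import exp, pi, sqrt

def rogers_ramanujan(q, depth=80):
    """Truncation of R(q) = q^(1/5) / (1 + q/(1 + q^2/(1 + q^3/(1 + ...))))."""
    tail = 1.0
    for k in range(depth, 0, -1):
        tail = 1.0 + q ** k / tail
    return q ** 0.2 / tail

# Closed form as we read it: R^5(e^{-2 pi / sqrt 5}) = sqrt(C^2 + 1) - C.
C = (5 * sqrt(5) + 11) / 2
closed_form = sqrt(C * C + 1) - C

if __name__ == "__main__":
    print(rogers_ramanujan(1.0), (sqrt(5) - 1) / 2)
    print(rogers_ramanujan(exp(-2 * pi / sqrt(5))) ** 5, closed_form)
```

At $q = 1$ the truncations are ratios of consecutive Fibonacci numbers, which is why the limit is the reciprocal of the golden ratio.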

In the remainder of the paper, we briefly describe some attempts and methods 
used to establish Ramanujan’s class invariants. 

In two papers [19], [20], Watson proved the 24 class invariants from Ramanujan’s 
paper [14] that cannot be found in Ramanujan’s second notebook. In the first [19], 
Watson devised an “empirical process” to calculate 14 of the 24 invariants, while in 
the second [20], he employed modular equations, which we define later in this 
paper, to prove 10 invariants. Watson [18] also used his empirical process to 
establish Ramanujan’s value for $G_{1353}$. In the introduction to [19], Watson remarked,
“It is intended to publish the calculations involved in the construction of
the set N + Q (the invariants appearing in both Ramanujan’s paper [14] and the 
second notebook) as part of the commentary on the note-books by Dr. B. M. 
Wilson and myself.” Although Watson and Wilson’s efforts to edit Ramanujan’s 
notebooks have been preserved in the library at Trinity College, Cambridge, 
Watson’s calculations of these twenty-one invariants are not found there. The 
twenty-one values of n are: 65, 69, 77, 81, 117, 141, 145, 147, 153, 205, 213, 217, 
265, 289, 301, 441, 445, 505, 553, 90, and 198. Watson wrote four further papers on 
the calculation of class invariants, and in those he verified three additional class 
invariants determined by Ramanujan, namely, those for n = 81, 147, and 289. 
Thus, after Watson’s work, and up until recent times, 18 of Ramanujan’s class 
invariants remained to be verified. 

For five of the values, $n$ is a multiple of 9, namely, n = 117, 153, 441, 90, and
198. The authors [5] found proofs for these values by using formulas relating $G_{9n}$
with $G_n$ and $g_{9n}$ with $g_n$, which we established by using one of Ramanujan’s
modular equations of degree 3. All of the remaining 13 values are for $G_n$, n = 65,
69, 77, 141, 145, 205, 213, 217, 265, 301, 445, 505, and 553. Note that each value of
n is the product of a small prime (3, 5, or 7) and a larger prime. Quite remarkably,
the class number for each of the 13 imaginary quadratic fields $\mathbb{Q}(\sqrt{-n})$ equals 8.
Moreover, there are precisely two classes per genus in each case. This is amazing!
It is extremely unlikely that Ramanujan had any knowledge of imaginary quadratic 
fields, genus theory, or class numbers. However, Ramanujan must have recognized 
some arithmetical properties shared by these fields, although he would have 
expressed his ideas in a language very different from what we use today. How did 
Ramanujan calculate these 116 class invariants? He left no clues in his notebooks. 
Since Weber’s methods were highly algebraic, it is very unlikely that Ramanujan 
journeyed along Weber’s paths. 
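The arithmetic claims about the thirteen remaining values of n can be confirmed by machine. The following sketch is ours: it factors each n and computes the class number of $\mathbb{Q}(\sqrt{-n})$ by counting reduced primitive binary quadratic forms of discriminant $-4n$ (each n here is congruent to 1 modulo 4, so $-4n$ is the field discriminant).

```python
from math import gcd

REMAINING = [65, 69, 77, 141, 145, 205, 213, 217, 265, 301, 445, 505, 553]

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def class_number(D):
    """Number of reduced primitive forms ax^2 + bxy + cy^2 with b^2 - 4ac = D < 0."""
    count = 0
    a = 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):      # reduced forms have -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):  # require a <= c, and b >= 0 when a == c
                continue
            if gcd(a, gcd(b, c)) == 1:
                count += 1
        a += 1
    return count

def small_prime_factor(n):
    """Return p in {3, 5, 7} such that n = p * (a prime), or None."""
    for p in (3, 5, 7):
        if n % p == 0 and is_prime(n // p):
            return p
    return None

if __name__ == "__main__":
    for n in REMAINING:
        print(n, small_prime_factor(n), n // small_prime_factor(n), class_number(-4 * n))
```

Since the discriminant $-4n$ has three prime divisors (2 and the two prime factors of n), genus theory gives four genera, so class number 8 means two classes per genus, as stated.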

In his paper [14], Ramanujan used modular equations to calculate only a couple 
of simple invariants. This fact and the sentence, “The values of $G_n$ and $g_n$ are got
from the same modular equation.” [14], [16, p. 25] are the only clues to his methods 
that Ramanujan provided for us. It would seem that if Ramanujan had employed 
another type of reasoning, he would have dropped some hint about it. 

We have mentioned modular equations three times already in this paper, so a
definition of a modular equation is overdue.

With the elliptic integral $K$ defined by (5), let $K$, $K'$, $L$, and $L'$ denote complete
elliptic integrals of the first kind associated with the moduli $k$, $k'$, $l$, and
$l' = \sqrt{1 - l^2}$, respectively, where $0 < k, l < 1$. Suppose that

$$n\,\frac{K'}{K} = \frac{L'}{L} \qquad (10)$$
for some positive integer n. A relation between k and / induced by (10) is called a 
modular equation of degree n. In fact, modular equations are algebraic equations. A 
modulus can be expressed in terms of classical theta functions. Although we 
suppress all details, this fact, (10), and (6) are the primary ingredients needed to 
show that a modular equation can alternatively be expressed as an identity relating 
theta functions with argument q and theta functions with argument q”. 
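To make the definition concrete, here is a sketch of our own built on a classical example that the paper does not quote: Legendre's modular equation of degree 3, $\sqrt{kl} + \sqrt{k'l'} = 1$. It uses the standard expressions of the moduli through Jacobi theta functions, $k = \theta_2^2(q)/\theta_3^2(q)$ and $k' = \theta_4^2(q)/\theta_3^2(q)$, where $\theta_3 = \varphi$; the modulus $l$ then corresponds to $q^3$.

```python
from math import sqrt

def theta2(q, terms=40):
    """theta_2(q) = sum over all integers n of q^((n + 1/2)^2)."""
    return 2.0 * sum(q ** ((n + 0.5) ** 2) for n in range(terms))

def theta3(q, terms=40):
    """theta_3(q) = sum over all integers n of q^(n^2); this is phi(q)."""
    return 1.0 + 2.0 * sum(q ** (n * n) for n in range(1, terms))

def theta4(q, terms=40):
    """theta_4(q) = sum over all integers n of (-1)^n q^(n^2); this is phi(-q)."""
    return 1.0 + 2.0 * sum((-1) ** n * q ** (n * n) for n in range(1, terms))

def moduli(q):
    """Return (k, k') for the nome q = exp(-pi K'/K)."""
    t3 = theta3(q) ** 2
    return theta2(q) ** 2 / t3, theta4(q) ** 2 / t3

def legendre_degree3(q):
    """Left side of sqrt(k l) + sqrt(k' l') = 1, with l belonging to q^3."""
    k, k_prime = moduli(q)
    l, l_prime = moduli(q ** 3)
    return sqrt(k * l) + sqrt(k_prime * l_prime)

if __name__ == "__main__":
    for q in (0.01, 0.1, 0.3):
        print(q, legendre_degree3(q))
```

Replacing q by $q^3$ multiplies $K'/K$ by 3, which is exactly condition (10) with n = 3.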

As mentioned above, Watson [20] used modular equations to establish some of 
Ramanujan’s invariants. We have been able to prove six of the remaining thirteen
values for $G_n$, namely, for n = 65, 69, 77, 141, 145, and 213, by using modular
equations of degrees p and q, where n = pq, but our approach is necessarily 
different from that of Watson. To prove the remaining seven invariants by 
employing modular equations, we would need modular equations of degrees 31, 41, 
43, 53, 79, 89, and 101. Apparently, only for degree 31 did Ramanujan derive a 
modular equation, for he recorded no modular equations for the other six degrees 
in his notebooks. Some of the modular equations that we employed are very 
complicated, and so we had to use Mathematica to effect some of our calculations. 
In conclusion, it seems unlikely that Ramanujan used only modular equations in 
these elusive computations. 

In order to prove the remaining class invariants of Ramanujan, we devised two 
methods [7]. 

The first uses Kronecker’s limit formula. In order to give a brief description of
this formula, we need to define the Epstein zeta-function. Let
$Q(u, v) = y^{-1}(u + vz)(u + v\bar{z})$, where $z = x + iy$ with $y > 0$. The Epstein
zeta-function $\zeta_Q(s)$ is defined for $\sigma = \operatorname{Re} s > 1$ by

$$\zeta_Q(s) = \sum \{Q(u,v)\}^{-s},$$

where the sum is over all pairs of integers (u, v) except (0, 0). It is well known that
$\zeta_Q(s)$ can be analytically continued to the entire complex s-plane, where $\zeta_Q(s)$ is
analytic except for a simple pole at s = 1. The Kronecker limit formula provides
the constant term in the Laurent expansion about s = 1. This constant term 
involves the Dedekind eta-function, which we defined in (7). The Kronecker limit 
formula then leads to representations for certain products of Dedekind eta-func- 
tions in terms of fundamental units. By (8), these representations allow us to 
calculate $G_n$. Our methods extend those of K. G. Ramanathan [13] who calculated
some of Ramanujan’s class invariants but required that $\mathbb{Q}(\sqrt{-n})$ contains only one
class per genus. Zhang [22], [23] has further extended the method to give rigorous 
proofs of the invariants of Ramanujan that Watson [19] had “empirically” calcu- 
lated. 

Our second method takes Watson’s ideas and employs class field theory to put 
the “empirical” process on a firm foundation [7]. It has been further extended by 
Chan to determine several new invariants [9]. 

It is highly doubtful that Ramanujan had any acquaintance with Kronecker’s 
limit formula, the arithmetic of quadratic fields, or class field theory. Thus, 
Ramanujan’s ideas still remain hidden behind an opaque curtain.
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In this paper, we have made many claims without proofs (as did Ramanujan), 


but complete proofs or references for all our assertions can be found in [2]. 
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Ramanujan, Taxicabs, Birthdates, 
ZIP Codes, and Twists 


Ken Ono 


Dedicated to the memory of S. Ramanujan on the 110th anniversary of his birth. 


It is well known that G. H. Hardy travelled in a taxicab numbered 1729 to an 
English nursing home to visit his bedridden colleague S. Ramanujan. Hardy was 
disappointed that his cab had such a mundane number, but to his surprise when he 
mentioned this to Ramanujan, the brilliant Indian mathematician found 1729 to be 
quite interesting, for it is the smallest integer that has two distinct representations 
as a sum of two cubes: 

$$1729 = 1^3 + 12^3 = 9^3 + 10^3.$$
J. H. Silverman used this famous anecdote to motivate the study of elliptic curves 
in a recent article [8]. 
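Ramanujan's remark about 1729 can be confirmed by a short search. The sketch below is ours; the cutoff `limit=20` is an arbitrary assumption that is safely large, since any sum of two positive cubes not exceeding 1729 uses bases of at most 12.

```python
def two_way_cube_sums(limit=20):
    """Map each n with at least two representations a^3 + b^3 (1 <= a <= b <= limit)
    to the list of its representations."""
    representations = {}
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            representations.setdefault(a**3 + b**3, []).append((a, b))
    return {n: reps for n, reps in representations.items() if len(reps) >= 2}

if __name__ == "__main__":
    found = two_way_cube_sums()
    smallest = min(found)
    print(smallest, found[smallest])
```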

Recently I learned that other permutations of the digits 1, 2, 7, and 9 are 
significant to the Ramanujan story. Two permutations involve Bruce Berndt, the 
diligent editor of Ramanujan’s notebooks. Bruce has devoted most of his profes- 
sional career to undertaking the daunting task of proving many of Ramanujan’s 
identities (written in notebooks without proofs), but to my surprise his fascination 
with Ramanujan has profoundly impacted his life outside mathematics. Sonya, 
Bruce’s youngest daughter, was born in 1972. Is this a coincidence, or could it be 
an example of “Ramanujan family planning”? With more sleuthing I discovered 
that Bruce’s home is in Urbana, Illinois 61802-7219. Could there be any truth to 
the rumor that Bruce paid the postmaster a mere $12.79 for this vanity zipcode? 

In a more serious direction, consider the number 2719, which came to my 
attention in joint work with K. Soundararajan [5]. We begin with the following 
footnote from Ramanujan’s 1916 paper on quadratic forms [6, p. 14]: 


“. . . the even numbers which are not of the form $x^2 + y^2 + 10z^2$ are the numbers

$$4^{\lambda}(16\mu + 6),$$

while the odd numbers that are not of that form, viz.,

$$3, 7, 21, 31, 33, 43, 67, 79, 87, 133, 217, 219, 223, 253, 307, 391, \ldots$$

do not seem to obey any simple law.”


In view of the list of exceptions, could there be a “simple law” that eluded 
Ramanujan? After extensive computation, amongst the odd integers two further 
exceptions emerged, the numbers 679 and of course 2719. A few years ago 
W. Duke and R. Schulze-Pillot [3] (see [2] for a survey) made a great breakthrough 
in the theory of ternary quadratic forms, and from their work it follows that there 
are only finitely many positive odd integers that are not of the form $x^2 + y^2 + 10z^2$.
Could it be that 2719 is the largest such integer? 
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Unfortunately we do not yet know enough to decide whether or not it is since 
they obtained no bound beyond which every odd integer is so represented. 
Although obtaining such a bound appears to be beyond the current state of 
knowledge, assuming certain Riemann hypotheses, the author and Soundararajan 
[5] have shown that the only positive odd integers that are not of the form 
$x^2 + y^2 + 10z^2$ are indeed 679, 2719, and the 16 numbers on Ramanujan’s list.
Therefore we have very good reason to believe that 2719 is the largest odd integer
that is not of the form $x^2 + y^2 + 10z^2$.
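The list of exceptions is easy to reproduce for small n. The sketch below is ours and uses an arbitrary search bound of 3000; it simply tests every odd n below the bound for a representation by $x^2 + y^2 + 10z^2$. It says nothing, of course, about what happens beyond the bound.

```python
from math import isqrt

def is_represented(n):
    """True if n = x^2 + y^2 + 10 z^2 for some integers x, y, z."""
    z = 0
    while 10 * z * z <= n:
        remainder = n - 10 * z * z
        for x in range(isqrt(remainder) + 1):
            y_squared = remainder - x * x
            y = isqrt(y_squared)
            if y * y == y_squared:
                return True
        z += 1
    return False

def odd_exceptions(bound=3000):
    """Odd positive integers below `bound` not of the form x^2 + y^2 + 10 z^2."""
    return [n for n in range(1, bound, 2) if not is_represented(n)]

if __name__ == "__main__":
    print(odd_exceptions())
```

Up to this bound the search returns Ramanujan's sixteen numbers together with 679 and 2719.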

Here we explore the special properties that these eighteen integers share. 
Obviously they are odd numbers n for which there are no integers x, y, and z with
$n = x^2 + y^2 + 10z^2$, and we even know that they are all square-free (see [1], [5, Th.
1]) and coprime to 10, but these numbers are linked for much deeper reasons
involving some of the most fundamental objects in algebraic number theory and 
arithmetic geometry. Let me explain. 

Following C. F. Gauss, any collection of equivalence classes of ternary quadratic 
forms that represent the same residue classes (mod M) for every M is called a 
“genus.” In our case, the genus containing Ramanujan’s ternary quadratic form 
$x^2 + y^2 + 10z^2$ contains only one other class, and a representative for this class is
the form $2x^2 + 2y^2 + 3z^2 - 2xz$. For convenience define $r_1(n)$ and $r_2(n)$ by

$$r_1(n) := \#\{(x, y, z) \mid x, y, z \in \mathbb{Z},\ x^2 + y^2 + 10z^2 = n\},$$
$$r_2(n) := \#\{(x, y, z) \mid x, y, z \in \mathbb{Z},\ 2x^2 + 2y^2 + 3z^2 - 2xz = n\}.$$

Therefore, Ramanujan wanted a rule for determining those odd $n$ for which
$r_1(n) = 0$.

To see the utility in considering both forms together recall Gauss’ Three 
Squares Theorem. Let h(D) denote the number of classes of primitive binary 
quadratic forms with discriminant D, the usual “class number,” and let $r(n)$
denote the number of representations of n by $x^2 + y^2 + z^2$. If n > 3 is square-free,
then

$$r(n) = \begin{cases} 12h(-4n) & \text{if } n \equiv 1, 2, 5, 6 \pmod 8, \\ 24h(-n) & \text{if } n \equiv 3 \pmod 8. \end{cases}$$

More generally, Gauss obtained formulas for the number of representations of
integers by genera, and in the case of Ramanujan’s form, if $n$ is a positive
square-free integer coprime to 10, then

$$r_1(n)/2 + r_2(n) = h(-40n).$$

Therefore if n is a positive odd integer that is not of the form $x^2 + y^2 + 10z^2$,
then

$$r_2(n) = h(-40n). \qquad (1)$$
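The genus formula can be tested by brute force for small n. This sketch is ours; it assumes the reading $r_1(n)/2 + r_2(n) = h(-40n)$ of the displayed formula, counts representations by exhaustive search, and computes $h(-40n)$ by counting reduced primitive forms.

```python
from itertools import product
from math import gcd, isqrt

def count_representations(form, n):
    """Number of integer triples (x, y, z) with form(x, y, z) == n.
    For both forms below every variable is bounded by sqrt(n) in absolute value."""
    bound = isqrt(n)
    box = range(-bound, bound + 1)
    return sum(1 for x, y, z in product(box, repeat=3) if form(x, y, z) == n)

def r1(n):
    return count_representations(lambda x, y, z: x*x + y*y + 10*z*z, n)

def r2(n):
    return count_representations(lambda x, y, z: 2*x*x + 2*y*y + 3*z*z - 2*x*z, n)

def class_number(D):
    """Number of reduced primitive forms ax^2 + bxy + cy^2 with b^2 - 4ac = D < 0."""
    count = 0
    a = 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue
            if gcd(a, gcd(b, c)) == 1:
                count += 1
        a += 1
    return count

if __name__ == "__main__":
    for n in (1, 3, 7, 11, 13):
        print(n, r1(n), r2(n), class_number(-40 * n))
```

For n = 3 and n = 7, which are on Ramanujan's list, the output shows $r_1(n) = 0$ and $r_2(n) = h(-40n)$, which is relation (1).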


It is also useful to consider the differences $r_1(n) - r_2(n)$. To do so, define

$$f(z) := \frac14\sum_{n=1}^{\infty}\left(r_1(n) - r_2(n)\right)q^n = q - q^3 - q^7 - q^9 + 2q^{13} + \cdots \quad (q := e^{2\pi i z} \text{ with } \operatorname{Im}(z) > 0). \qquad (2)$$

This function f is a “weight 3/2 modular form.” An analytic function m(z) on the
upper half of the complex plane is a modular form of weight k if for each suitable
matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$ there exist roots of unity $\epsilon(d)$ for which

$$m\left(\frac{az+b}{cz+d}\right) = \epsilon(d)(cz + d)^{k}m(z).$$
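To study $r_1(n) - r_2(n)$ we employ the Shimura lift [7], a beautiful correspondence
between certain half-integral weight modular forms and integral weight
modular forms. In this case if integers $A(n)$ are defined by

$$\sum_{n=1}^{\infty}\frac{A(n)}{n^s} := \sum_{n=1}^{\infty}\frac{\chi(n)}{n^s}\cdot\sum_{n=1}^{\infty}\frac{r_1(n^2) - r_2(n^2)}{4n^s},$$

where $\chi$ denotes the Legendre-Kronecker quadratic character $\chi(n) := \left(\frac{-10}{n}\right)$,
and if q and z are as in (2), then

$$F(z) := \sum_{n=1}^{\infty}A(n)q^n = q\prod_{n=1}^{\infty}\left(1 - q^{2n}\right)^2\left(1 - q^{10n}\right)^2 = q - 2q^3 - q^5 + 2q^7 + q^9 + 2q^{13} + \cdots \qquad (3)$$

is a weight 2 modular form.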

The modular form F(z) provides an example of the celebrated Shimura- 
Taniyama Conjecture, whose proof in special cases by A. Wiles yields Fermat’s 
Last Theorem. The conjecture asserts that the coefficients of certain weight 2 
modular forms, the $A(n)$, equal the coefficients of L-functions of elliptic curves. In
this case let E denote the elliptic curve over the rational numbers

$$E:\quad y^2 = x^3 + x^2 + 4x + 4.$$

For each odd prime p let N(p) denote the number of pairs, x (mod p), y (mod
p), that satisfy the congruence

$$y^2 \equiv x^3 + x^2 + 4x + 4 \pmod p.$$

If c(p) = p − N(p), then the Hasse-Weil L-function L(E, s) is defined by the
following product over all the odd primes:

$$L(E,s) = \sum_{n=1}^{\infty}\frac{c(n)}{n^s} := \prod_{p\ \mathrm{odd}}\frac{1}{1 - c(p)p^{-s} + p^{1-2s}} = 1 - \frac{2}{3^s} - \frac{1}{5^s} + \frac{2}{7^s} + \frac{1}{9^s} + \frac{2}{13^s} + \cdots. \qquad (4)$$

By comparing (3) and (4) one sees that A(n) = c(n) for each n ≤ 13. In fact this
equality holds for every positive integer n, and is an example of the phenomenon
described by the Shimura-Taniyama Conjecture.
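The agreement between the two sets of coefficients can be observed directly for small primes. The sketch below is ours: it expands the product $q\prod(1-q^{2n})^2(1-q^{10n})^2$ as a power series and compares its coefficients at primes p with $c(p) = p - N(p)$, where N(p) is obtained by counting solutions of the congruence by brute force.

```python
def eta_product_coefficients(max_degree):
    """Coefficients A(0..max_degree) of q * prod_{n>=1} (1 - q^(2n))^2 (1 - q^(10n))^2."""
    series = [0] * (max_degree + 1)
    series[1] = 1                                   # the leading factor q
    for step in (2, 10):
        for exponent in range(step, max_degree + 1, step):
            for _ in range(2):                      # each factor appears squared
                # multiply the truncated series by (1 - q^exponent) in place
                for i in range(max_degree, exponent - 1, -1):
                    series[i] -= series[i - exponent]
    return series

def c(p):
    """c(p) = p - N(p), N(p) = number of (x, y) mod p with y^2 = x^3 + x^2 + 4x + 4."""
    square_count = [0] * p
    for y in range(p):
        square_count[y * y % p] += 1
    solutions = sum(square_count[(x**3 + x**2 + 4*x + 4) % p] for x in range(p))
    return p - solutions

if __name__ == "__main__":
    A = eta_product_coefficients(50)
    for p in (3, 5, 7, 11, 13, 17, 19, 23):
        print(p, A[p], c(p))
```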

Are these observations relevant to Ramanujan’s query? They are, and the 
answer lies in the work of J.-L. Waldspurger [10] who provided a very deep and 
beautiful interpretation of Shimura’s lift. In our case let n be a positive odd
square-free integer, and define the −10n quadratic twist L(E(−10n), s) by

$$L(E(-10n), s) := \prod_{p \ne 2, 5}\frac{1}{1 - c(p)\left(\frac{-10n}{p}\right)p^{-s} + p^{1-2s}}.$$

This is the L-function for the elliptic curve E(−10n)

$$E(-10n):\quad y^2 = x^3 - 10nx^2 + 400n^2x - 4000n^3.$$
If

$$\Omega := \int_{10}^{\infty}\frac{dx}{\sqrt{x^3 - 10x^2 + 400x - 4000}} \approx 0.7195\ldots,$$

then for every odd square-free integer n ≠ 5 Waldspurger’s theorem implies

$$\left(r_1(n) - r_2(n)\right)^2 = \frac{4\sqrt{n}}{\Omega}\cdot L(E(-10n), 1).$$

Therefore, by (1), if n is a positive odd integer that is not of the form
$x^2 + y^2 + 10z^2$, then

$$h^2(-40n) = \frac{4\sqrt{n}}{\Omega}\cdot L(E(-10n), 1). \qquad (5)$$

Although (5) is a “law” that the odd integers not of the form $x^2 + y^2 + 10z^2$ obey,
it certainly is not a simple one. However, its formulation is particularly intriguing.

First we recall some facts about elliptic curves. Let C denote the set of rational
points (x, y) satisfying

$$C:\quad y^2 = x^3 + ax^2 + bx + c$$

where a, b and c are fixed rational numbers. If the discriminant of $x^3 + ax^2 + bx + c$
is non-zero, then Mordell proved that C, including a “point at infinity,” forms
a finitely generated abelian group whose group law is a “chord-tangent” law (see
[9]). Therefore,

$$C \cong C_{\mathrm{torsion}} \times \mathbb{Z}^{r}$$

where $C_{\mathrm{torsion}}$, the torsion subgroup of C, is a finite abelian group, and the rank r
is a non-negative integer. Note that C has finitely many points if and only if r = 0.
Quite a bit is known about $C_{\mathrm{torsion}}$. By a theorem of Mazur it is known that
$C_{\mathrm{torsion}}$ satisfies

$$C_{\mathrm{torsion}} \cong \begin{cases} \mathbb{Z}_m & \text{where } 1 \le m \le 10, \text{ or } m = 12, \\ \mathbb{Z}_2 \times \mathbb{Z}_{2m} & \text{where } 1 \le m \le 4 \end{cases}$$

($\mathbb{Z}_d$ denotes the cyclic group with d elements), and with this classification it is
fairly easy to deduce $C_{\mathrm{torsion}}$ for any given C.

Computing r is a more difficult question, and although one can typically 
compute r in practice, the problem in general remains open. In part these 
problems revolve around the Birch and Swinnerton-Dyer Conjecture, which asserts 
that the analytic behavior of L(C,s) at s = 1 predicts the structure of C, in 
particular r. In its weakest form the conjecture asserts that L(C, s) has an analytic 
continuation to the entire complex plane, and that r equals the order of vanishing 
at s=1 of L(C,s). In particular, C has finitely many points precisely when 
L(C, 1) # 0. 

For a “modular” elliptic curve C, one satisfying the Shimura-Taniyama Conjec- 
ture, V. Kolyvagin [4] proved that C has finitely many points if L(C, 1) # 0. 
Therefore by the positivity of h(−40n), (5), and Kolyvagin’s theorem, if n is a
positive odd integer that is not of the form $x^2 + y^2 + 10z^2$, then E(−10n) has
finitely many points. In fact if n equals 679, 2719, or any of the 16 numbers on
Ramanujan’s list, then the only rational point (x, y) on E(−10n)

$$y^2 = x^3 - 10nx^2 + 400n^2x - 4000n^3,$$

is (10n, 0).
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In its full strength the Birch and Swinnerton-Dyer Conjecture predicts even 
more. If L(C,1) #0 the conjecture predicts that L(C,1) is an explicit real 
multiple of the order of Ш(C), the Tate-Shafarevich group of C, which measures
the extent to which the “local-global principle” fails for an elliptic curve C. Recall 
that a conic has a point with rational coordinates precisely when it contains a point 
with real coordinates and a point with coordinates that are p-adic numbers for 
every prime p. However this is not true for elliptic curves. In a famous example, E. 
Selmer noted that there are no non-trivial rational points on 


$$3x^3 + 4y^3 + 5z^3 = 0,$$


even though it has points over every field of p-adic numbers. The Tate-Shafarevich 
group measures the failure of this principle. 

In our case if n is 679, 2719, or one of the 16 integers on Ramanujan’s list, then
(5) and the Birch and Swinnerton-Dyer Conjecture imply

$$h^2(-40n) = 4^{f(n)+1}\,\left|Ш(E(-10n))\right| \qquad (6)$$

where f(n) denotes the number of prime factors of n.

Just as Tate-Shafarevich groups measure the obstruction to the “local-global” 
principle for elliptic curves, the set of classes of discriminant D primitive binary 
quadratic forms, denoted by CL(D), measures an obstruction. The set CL(D) is an 
abelian group with order h(D) that is isomorphic to the “ideal class group” of 
$\mathbb{Q}(\sqrt{D})$, and it measures the extent to which unique factorization fails in the ring of
integers of $\mathbb{Q}(\sqrt{D})$. For instance, in the ring of integers of $\mathbb{Q}(\sqrt{-40})$, numbers
of the form $a + b\sqrt{-10}$ with $a, b \in \mathbb{Z}$, the integer 14 does not factor uniquely
into irreducibles since it has the following factorizations:

$$14 = 2\cdot 7 = (2 + \sqrt{-10})\cdot(2 - \sqrt{-10}).$$

By Gauss’ genus theory the index of the subgroup

$$CL^2(-40n) = \{\alpha^2 \mid \alpha \in CL(-40n)\}$$

in CL(−40n) is $2^{f(n)+1}$. Therefore for the known odd integers not of the form
$x^2 + y^2 + 10z^2$, (6) and the Birch and Swinnerton-Dyer Conjecture imply the
following tantalizing equality relating class groups and Tate-Shafarevich groups:

$$\left|CL^2(-40n) \times CL^2(-40n)\right| = \left|Ш(E(-10n))\right|. \qquad (7)$$


From our discussion, Ramanujan’s search for a “simple law” leads to several deep 
theorems and conjectures in arithmetic geometry. To reiterate, if n equals 679, 
2719, or one of the integers on Ramanujan’s list, then using Shimura’s lift, the 
Shimura-Taniyama correspondence, and the works of Kolyvagin and Waldspurger, 
we have obtained the following gems: 


(i) There are no rational numbers x and y with y ≠ 0 for which
$$y^2 = x^3 - 10nx^2 + 400n^2x - 4000n^3.$$
(ii) Assuming the Birch and Swinnerton-Dyer Conjecture,

$$\left|CL^2(-40n) \times CL^2(-40n)\right| = \left|Ш(E(-10n))\right|.$$


There are a few other ternary forms that also have such elegant properties, but 
most do not. This illustrates again how Ramanujan’s deep insight continues to 
thrive beyond his centenary. By the way, Ramanujan’s lifespan was 1887-1920. 
Whenever I am angry or depressed, I pull down [his] collected
papers from the shelf and take a quiet stroll in Ramanujan’s
garden. I recommend this therapy to all of you who suffer from
headaches or jangled nerves. And Ramanujan’s papers are not only
a good therapy for headaches. They also are full of beautiful ideas
which may help you to do more interesting mathematics.


Freeman Dyson, Selected Papers,
American Mathematical Society, Providence, 1996, p. 205

Simplicity and Surprise in Ramanujan’s
“Lost” Notebook


George E. Andrews 


1. INTRODUCTION. In 1979, I wrote an introduction [2] to Ramanujan’s “Lost” 
Notebook [7]. In that introduction I provided a short history of my connection with 
this amazing document as well as a sampling of some of the results. 

Suffice it to say that the “Lost” Notebook contains a substantial collection of 
the discoveries the Indian genius, Ramanujan, made during 1919-1920, the last 
year of his short life. Furthermore, nothing was published on any of the formulas 
in the “Lost” Notebook until the appearance of [2] in 1979. 

One of the formulas presented in the introduction was [7; p. 47], [2; p. 90,
eq. (1.3)]

\[
\left( \cfrac{1}{1 + \cfrac{q}{1 + \cfrac{q^2}{1 + \cfrac{q^3}{1 + \cdots}}}} \right)^{3}
= \frac{\displaystyle\sum_{n=0}^{\infty} q^{5n^2+4n}\,\frac{1+q^{5n+2}}{1-q^{5n+2}} - \sum_{n=0}^{\infty} q^{5n^2+6n+1}\,\frac{1+q^{5n+3}}{1-q^{5n+3}}}
{\displaystyle\sum_{n=0}^{\infty} q^{5n^2+2n}\,\frac{1+q^{5n+1}}{1-q^{5n+1}} - \sum_{n=0}^{\infty} q^{5n^2+8n+3}\,\frac{1+q^{5n+4}}{1-q^{5n+4}}}
\tag{1.1}
\]

While cubing a continued fraction may be surprising, it is certainly not simple.
There are, however, much less daunting results in the “Lost” Notebook. Here
are two that illustrate simplicity and surprise [7; p. 31]:


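\[
1 + \frac{q}{(1-q)(1-q^2)} + \frac{q^2}{(1-q)(1-q^2)(1-q^3)(1-q^4)} + \frac{q^3}{(1-q)(1-q^2)(1-q^3)(1-q^4)(1-q^5)(1-q^6)} + \cdots
= \frac{(1 + q^{11} + q^{13} + \cdots) - q\,(1 + q^{5} + q^{19} + \cdots)}{1 - 2q + 2q^4 - 2q^9 + 2q^{16} - 2q^{25} + \cdots}
\tag{1.2}
\]

and

\[
1 + \frac{q}{(1+q)(1+q^2)} + \frac{q^2}{(1+q)(1+q^2)(1+q^3)(1+q^4)} + \frac{q^3}{(1+q)(1+q^2)(1+q^3)(1+q^4)(1+q^5)(1+q^6)} + \cdots
= (1 - q^{11} + q^{13} - \cdots) + q\,(1 - q^{5} + q^{19} - \cdots).
\tag{1.3}
\]

The sequence 0, 11, 13, ... is $\{12n^2 + n\}_{n=-\infty}^{\infty}$, and 0, 5, 19, ... is
$\{12n^2 + 7n\}_{n=-\infty}^{\infty}$. Of course 0, 1, 4, 9, 16, ... is $\{n^2\}_{n=0}^{\infty}$.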

Please note the unreasonable fact that the passage from (1.2) to (1.3) consists of 
changing various signs and dropping the denominator on the right-hand side of 
(1.2). The implausibility of this transformation is a standard Ramanujan surprise. 
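
Identities (1.2) and (1.3) can be spot-checked by comparing power series coefficients. The following is a minimal sketch, not part of the original argument: the truncation order `N`, the helper names, and the sign rule `by_side` (plus for n ≥ 0, minus for n < 0, read off from the displayed terms of (1.3)) are assumptions made for illustration. To avoid dividing by the theta series, (1.2) is checked after multiplying through by its denominator.

```python
N = 80  # work modulo q^N (illustrative truncation order)

def mul(a, b):
    """Product of two truncated power series (lists of N integer coefficients)."""
    out = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                out[i + j] += ai * b[j]
    return out

def ramanujan_sum(sign):
    """1 + sum_{n>=1} q^n / prod_{j=1}^{2n} (1 + sign*q^j), truncated at q^N."""
    total = [0] * N
    total[0] = 1
    inv = [0] * N          # running value of 1 / prod_{j<=2n} (1 + sign*q^j)
    inv[0] = 1
    for n in range(1, N):
        for k in (2 * n - 1, 2 * n):
            for i in range(k, N):      # in-place division by (1 + sign*q^k)
                inv[i] -= sign * inv[i - k]
        for i in range(n, N):          # add q^n times the current quotient
            total[i] += inv[i - n]
    return total

def two_sided(a, b, weight):
    """Sum over all integers n of weight(n) * q^(a*n^2 + b*n), truncated at q^N."""
    out = [0] * N
    for n in range(-N, N + 1):
        e = a * n * n + b * n
        if 0 <= e < N:
            out[e] += weight(n)
    return out

def shift(series):
    """Multiply a truncated series by q."""
    return [0] + series[:-1]

one = lambda n: 1
alternating = lambda n: -1 if n % 2 else 1
by_side = lambda n: 1 if n >= 0 else -1   # assumed sign rule for (1.3)

theta = two_sided(1, 0, alternating)      # 1 - 2q + 2q^4 - 2q^9 + ...
num_12 = [x - y for x, y in zip(two_sided(12, 1, one), shift(two_sided(12, 7, one)))]
rhs_13 = [x + y for x, y in zip(two_sided(12, 1, by_side), shift(two_sided(12, 7, by_side)))]

check_12 = mul(ramanujan_sum(-1), theta) == num_12   # (1.2), denominator cleared
check_13 = ramanujan_sum(+1) == rhs_13               # (1.3)
print(check_12, check_13)
```

Agreement of the first `N` coefficients is evidence, not a proof; the proofs occupy Sections 3 and 4.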

One may, of course, object that (1.2) and (1.3) are not all that simple. I will try 
to convince you otherwise in Section 2 with an account of related formulas that 
Euler found (or could have found) and that can be proved easily by mathematical 
induction. In Sections 3 and 4 we show why Ramanujan’s discoveries turn out to be
surprisingly deeper than those of Euler, and while the formulas are simply stated
their proofs are not that easy, another surprise. In Section 5, I briefly describe the
number-theoretic implication of these results, and I conclude with a sketch of 
related discoveries by Ramanujan. 


2. BACKGROUND. We consider two formulas in this section. They look like
slightly simpler cousins of (1.2) and (1.3). They are, it turns out, easily proved.

\[ 1 + \frac{q}{1-q} + \frac{q^2}{(1-q)(1-q^2)} + \frac{q^3}{(1-q)(1-q^2)(1-q^3)} + \cdots = \frac{1}{(1-q)(1-q^2)(1-q^3)\cdots} \tag{2.1} \]

\[ 1 + \frac{q}{1+q} + \frac{q^2}{(1+q)(1+q^2)} + \frac{q^3}{(1+q)(1+q^2)(1+q^3)} + \cdots = 2 - (1-q)(1-q^3)(1-q^5)(1-q^7)\cdots \tag{2.2} \]

Identity (2.1) is a special case of a formula due to Euler [1; p. 19, eq. (2.2.5)],
and (2.2) can be deduced from a special case of Heine’s transformation of
q-hypergeometric series [1; p. 19, Cor. 2.3]. However, each is the limiting case of one of two
formulas whose proof is an immediate mathematical induction exercise. Namely

\[ 1 + \sum_{j=1}^{n} \frac{q^j}{(1-q)(1-q^2)\cdots(1-q^j)} = \frac{1}{(1-q)(1-q^2)\cdots(1-q^n)} \tag{2.3} \]

\[ 1 + \sum_{j=1}^{n} \frac{q^j}{(1+q)(1+q^2)\cdots(1+q^j)} = 2 - \frac{1}{(1+q)(1+q^2)\cdots(1+q^n)} \tag{2.4} \]

The mathematical induction proof of (2.3) hinges on the identity

\[ \frac{1}{(1-q)(1-q^2)\cdots(1-q^n)} - \frac{1}{(1-q)(1-q^2)\cdots(1-q^{n-1})} = \frac{1-(1-q^n)}{(1-q)(1-q^2)\cdots(1-q^n)} = \frac{q^n}{(1-q)(1-q^2)\cdots(1-q^n)}, \tag{2.5} \]

while the proof of (2.4) relies on

\[ \left(2 - \frac{1}{(1+q)(1+q^2)\cdots(1+q^n)}\right) - \left(2 - \frac{1}{(1+q)(1+q^2)\cdots(1+q^{n-1})}\right) = \frac{(1+q^n)-1}{(1+q)(1+q^2)\cdots(1+q^n)} = \frac{q^n}{(1+q)(1+q^2)\cdots(1+q^n)}. \tag{2.6} \]
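
The finite identities (2.3) and (2.4) can also be spot-checked with exact rational arithmetic. This is only a sketch: the test value q = 1/3 and the range of n are arbitrary choices made for illustration, and a numerical check at one value is no substitute for the induction.

```python
from fractions import Fraction

def partial_sums(q, n):
    """Left sides of (2.3) and (2.4) for rational q, plus the two finite products."""
    minus_prod = plus_prod = Fraction(1)
    s_minus = s_plus = Fraction(1)
    for j in range(1, n + 1):
        minus_prod *= 1 - q**j          # (1-q)(1-q^2)...(1-q^j)
        plus_prod *= 1 + q**j           # (1+q)(1+q^2)...(1+q^j)
        s_minus += q**j / minus_prod
        s_plus += q**j / plus_prod
    return s_minus, s_plus, minus_prod, plus_prod

q = Fraction(1, 3)  # arbitrary test value, chosen only for illustration
results = []
for n in range(1, 9):
    s_minus, s_plus, minus_prod, plus_prod = partial_sums(q, n)
    results.append(s_minus == 1 / minus_prod and s_plus == 2 - 1 / plus_prod)
print(all(results))
```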
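A complete detailed proof would be a good exercise for a college algebra class.
If we let n tend to infinity in (2.3), we deduce (2.1). If we let n tend to infinity in
(2.4), we find

\[ 1 + \frac{q}{1+q} + \frac{q^2}{(1+q)(1+q^2)} + \frac{q^3}{(1+q)(1+q^2)(1+q^3)} + \cdots = 2 - \frac{1}{(1+q)(1+q^2)(1+q^3)\cdots}, \tag{2.7} \]

and (2.7) is equivalent to (2.2) because of Euler’s famous infinite product identity
[1; p. 5, eq. (1.2.5)]:

\[ \frac{1}{(1+q)(1+q^2)(1+q^3)\cdots} = \frac{(1-q)}{(1-q^2)}\cdot\frac{(1-q^2)}{(1-q^4)}\cdot\frac{(1-q^3)}{(1-q^6)}\cdot\frac{(1-q^4)}{(1-q^8)}\cdot\frac{(1-q^5)}{(1-q^{10})}\cdots = (1-q)(1-q^3)(1-q^5)\cdots. \tag{2.8} \]

What could be simpler?


By means of two famous results, the first by Euler [1; p. 11, eq. (1.3.1)] and the
second by Gauss [1; p. 23, eq. (2.2.12)],

\[ \prod_{n=1}^{\infty}(1-q^n) = 1 + \sum_{n=1}^{\infty}(-1)^n\left(q^{n(3n-1)/2} + q^{n(3n+1)/2}\right), \tag{2.9} \]

and

\[ \prod_{n=1}^{\infty}\frac{1-q^n}{1+q^n} = 1 + 2\sum_{n=1}^{\infty}(-1)^n q^{n^2}, \tag{2.10} \]

we can recast (2.1) and (2.2) so that they look much more like (1.2) and (1.3).
Namely by (2.9)

\[ 1 + \frac{q}{1-q} + \frac{q^2}{(1-q)(1-q^2)} + \frac{q^3}{(1-q)(1-q^2)(1-q^3)} + \cdots = \frac{1}{1 - q - q^2 + q^5 + q^7 - q^{12} - q^{15} + \cdots} \tag{2.11} \]

and by (2.9) and (2.10)

\[ 1 + \frac{q}{1+q} + \frac{q^2}{(1+q)(1+q^2)} + \frac{q^3}{(1+q)(1+q^2)(1+q^3)} + \cdots = 2 - \frac{1 - 2q + 2q^4 - 2q^9 + 2q^{16} - 2q^{25} + \cdots}{1 - q - q^2 + q^5 + q^7 - q^{12} - q^{15} + \cdots}. \tag{2.12} \]

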
3. RAMANUJAN’S FORMULA (1.2). I direct your attention to the fact that 
Ramanujan’s series on the left sides of (1.2) and (1.3) are obtained from the 
corresponding series in (2.1) and (2.2) by doubling the number of factors in the 
denominator, surely no big deal. 
However, it is a big deal! For starters, we no longer have simple representations
like (2.3) and (2.4) for the partial sums. For example,

\[ 1 + \frac{q}{(1-q)(1-q^2)} + \frac{q^2}{(1-q)(1-q^2)(1-q^3)(1-q^4)} = \frac{1 - q^4 + q^5 - q^9 + q^{10}}{1 - q - q^2 + 2q^5 - q^8 - q^9 + q^{10}}, \tag{3.1} \]

a mess that gives no hint of the right-hand side of (1.2). Indeed, failure has met
every attempt I know of to prove (1.2) and (1.3) simply. The only proofs I know rely 
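on results deeper than (2.1) and (2.2). To prove (1.2), we require Euler’s full
identity [1; p. 19, eq. (2.2.5)]

\[ 1 + \sum_{n=1}^{\infty}\frac{z^n}{(1-q)(1-q^2)\cdots(1-q^n)} = \prod_{n=0}^{\infty}\frac{1}{1-zq^n}, \tag{3.2} \]

Jacobi’s Triple Product Identity [1; p. 21, eq. (2.2.10)], [6; p. 12, eq. (1.6.1)], one of
the cornerstones of elliptic theta function theory,

\[ \sum_{n=-\infty}^{\infty} z^n q^{n(n-1)/2} = \prod_{m=0}^{\infty}(1-q^{m+1})(1+zq^{m})(1+z^{-1}q^{m+1}), \tag{3.3} \]

and the lesser known but elegant Quintuple Product Identity [6; p. 134]

\[ \sum_{n=-\infty}^{\infty} (-1)^n z^{3n} q^{n(3n-1)/2}(1+zq^n) = \prod_{m=0}^{\infty}(1-q^{m+1})(1+zq^{m})(1+z^{-1}q^{m+1})(1-z^2q^{2m+1})(1-z^{-2}q^{2m+1}). \tag{3.4} \]

Hence

\begin{align*}
1 + \sum_{n=1}^{\infty}\frac{q^n}{(1-q)(1-q^2)\cdots(1-q^{2n})}
&= 1 + \frac12\sum_{n=1}^{\infty}\frac{q^{n/2}\,(1+(-1)^n)}{(1-q)(1-q^2)\cdots(1-q^n)} \\
&\qquad \text{(the terms with odd $n$ are zero)} \\
&= \frac12\left(\frac{1}{(1-q^{1/2})(1-q^{3/2})(1-q^{5/2})\cdots} + \frac{1}{(1+q^{1/2})(1+q^{3/2})(1+q^{5/2})\cdots}\right) \\
&\qquad \text{(by two applications of (3.2), the first with $z = q^{1/2}$, the second with $z = -q^{1/2}$)} \\
&= \frac12\cdot\frac{(1+q^{1/2})(1+q^{3/2})(1+q^{5/2})\cdots + (1-q^{1/2})(1-q^{3/2})(1-q^{5/2})\cdots}{(1-q)(1-q^3)(1-q^5)\cdots} \\
&\qquad \text{(by the same algebraic cancellation used in (2.8))} \\
&= \frac12\cdot\frac{\prod_{m=0}^{\infty}(1-q^{2m+2})(1+q^{2m+1/2})(1+q^{2m+3/2}) + \prod_{m=0}^{\infty}(1-q^{2m+2})(1-q^{2m+1/2})(1-q^{2m+3/2})}{\prod_{m=0}^{\infty}(1-q^{2m+2})\;(1-q)(1-q^3)(1-q^5)\cdots} \\
&= \frac12\cdot\frac{\sum_{n=-\infty}^{\infty} q^{n^2+n/2} + \sum_{n=-\infty}^{\infty}(-1)^n q^{n^2+n/2}}{\prod_{m=0}^{\infty}(1-q^{2m+2})\prod_{m=0}^{\infty}(1-q^{2m+1})} \\
&= \frac{\sum_{n=-\infty}^{\infty} q^{4n^2+n}}{\prod_{m=0}^{\infty}(1-q^{2m+2})\prod_{m=0}^{\infty}(1-q^{2m+1})}
\qquad \text{(the terms with $n$ odd cancelled)} \\
&= \frac{\prod_{m=0}^{\infty}(1-q^{8m+8})(1+q^{8m+3})(1+q^{8m+5})}{\prod_{m=0}^{\infty}(1-q^{2m+2})\prod_{m=0}^{\infty}(1-q^{2m+1})} \\
&= \frac{\prod_{m=0}^{\infty}(1-q^{8m+8})(1-q^{8m+1})(1-q^{8m+7})(1-q^{16m+6})(1-q^{16m+10})}{\prod_{m=0}^{\infty}(1-q^{2m+2})\prod_{m=0}^{\infty}(1-q^{2m+1})^2} \\
&\qquad \text{(here we multiplied numerator and denominator by $\textstyle\prod_{m=0}^{\infty}(1-q^{2m+1})$ and then rewrote the numerator} \\
&\qquad \text{using $\textstyle\prod_{m=0}^{\infty}(1-q^{2m+1}) = \prod_{m=0}^{\infty}(1-q^{8m+1})(1-q^{8m+3})(1-q^{8m+5})(1-q^{8m+7})$)} \\
&= \frac{\sum_{n=-\infty}^{\infty}\left(q^{12n^2-n} - q^{12n^2+7n+1}\right)}{1 - 2q + 2q^4 - 2q^9 + 2q^{16} - 2q^{25} + \cdots}.
\end{align*}

The last expression is finally the right-hand side of (1.2). The final numerator is
obtained from the penultimate numerator by invoking (3.4) with q replaced by $q^8$
and then z replaced by −q. The final denominator is obtained from the penultimate
denominator by invoking (3.3) with q replaced by $q^2$ and z then replaced
by −q.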

While no one rediscovered (1.2) prior to the unearthing of the “Lost” Note- 
book, Leonard Carlitz [4; eq. (24)] did prove a formula equivalent to (1.2). His 
identity is essentially the above stopping at the antepenultimate line. 

As hard as it may be to believe, identity (1.3) is an even tougher nut than (1.2),
as we shall see in the next section.


4. RAMANUJAN’S FORMULA (1.3). Our starting point for (1.3) is a formula also
taken from the “Lost” Notebook [7; p. 37] ([3; p. 137, eq. (1.1)], cf. [5; Ch. 1, §7]).
This formula is substantially more difficult to prove than any of the background
formulas used in Section 3.

\[
1 + \sum_{n=1}^{\infty}\frac{q^n}{(1+aq)(1+a^{-1}q)(1+aq^2)(1+a^{-1}q^2)\cdots(1+aq^n)(1+a^{-1}q^n)}
= (1+a)\sum_{n=0}^{\infty} a^{3n}q^{n(3n+1)/2}\left(1-a^2q^{2n+1}\right) - \frac{a\displaystyle\sum_{n=0}^{\infty}(-1)^n a^{2n}q^{n(n+1)/2}}{\displaystyle\prod_{j=1}^{\infty}(1+aq^j)(1+a^{-1}q^j)}
\tag{4.1}
\]


Now set $a = i\ (= \sqrt{-1})$ in (4.1) and take real parts of both sides:

\begin{align*}
1 + \sum_{n=1}^{\infty}\frac{q^n}{(1+q^2)(1+q^4)\cdots(1+q^{2n})}
&= \operatorname{Re}\Bigl((1+i)\sum_{n=0}^{\infty} i^{3n}q^{n(3n+1)/2}\left(1+q^{2n+1}\right)\Bigr) \\
&= \sum_{n=0}^{\infty}(-1)^n q^{n(6n+1)}\left(1+q^{4n+1}\right) + \sum_{n=0}^{\infty}(-1)^n q^{(2n+1)(3n+2)}\left(1+q^{4n+3}\right). \tag{4.2}
\end{align*}

I need hardly remark that the passage from (4.1) to (4.2) has been a dramatic
simplification. Whereas in (4.1) the right-hand side was a combination of infinite
series and infinite products, the right-hand side of (4.2) is now a power series in q
whose only coefficients are 0 and ±1.

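From (4.2), identity (1.3) follows easily. Let us call the left-hand side of (1.3)
f(q) and the left-hand side of (4.2) h(q). Then

\begin{align*}
f(q^2) &= 1 + \sum_{n=1}^{\infty}\frac{q^{2n}}{(1+q^2)(1+q^4)\cdots(1+q^{4n})} \\
&= 1 + \frac12\sum_{n=1}^{\infty}\frac{q^{n}\,(1+(-1)^n)}{(1+q^2)(1+q^4)\cdots(1+q^{2n})} \\
&= \frac12\bigl(h(q) + h(-q)\bigr) \\
&= \text{Even part of } \Bigl(\sum_{n=0}^{\infty}(-1)^n q^{n(6n+1)}\left(1+q^{4n+1}\right) + \sum_{n=0}^{\infty}(-1)^n q^{(2n+1)(3n+2)}\left(1+q^{4n+3}\right)\Bigr) \qquad \text{(by (4.2))} \\
&= \sum_{n=0}^{\infty} q^{24n^2+2n} - \sum_{n=0}^{\infty} q^{24n^2+34n+12} + \sum_{n=0}^{\infty} q^{24n^2+14n+2} - \sum_{n=0}^{\infty} q^{24n^2+46n+22}.
\end{align*}

Hence

\[ f(q) = \sum_{n=0}^{\infty}\left(q^{12n^2+n} - q^{12n^2+23n+11}\right) + q\sum_{n=0}^{\infty}\left(q^{12n^2+7n} - q^{12n^2+17n+5}\right), \]

which is (1.3).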
5. APPLICATION TO PARTITIONS. We could easily derive many number-theoretic
results from the identities developed in Sections 3 and 4. We limit ourselves
to one of the simplest.

In additive number theory, partitions refer to unordered decompositions of
integers into sums of positive integers. For example, the five partitions of 4 are
4, 3+1, 2+2, 2+1+1, 1+1+1+1. Note the adjective “unordered”: 1+2+1
and 2+1+1 are considered identical.

Let $P_a(n)$ (a = 1 or 3) denote the number of partitions of n into parts where
only the largest part appears an odd number of times and the total number of
parts is congruent to a mod 4.

For example, there are four partitions of 6 in which only the largest part
appears an odd number of times: 6, 4+1+1, 2+2+2, 2+1+1+1+1. So
$P_1(6) = 2 = P_3(6)$.

There are fifteen partitions of 12 in which only the largest part appears an odd
number of times: 12, 10+1+1, 8+2+2, 8+1+1+1+1, 6+3+3, 6+2+2+1+1,
6+1+1+1+1+1+1, 4+4+4, 4+3+3+1+1, 4+2+2+2+2, 4+2+2+1+1+1+1,
4+1+1+1+1+1+1+1+1, 2+2+2+2+2+1+1, 2+2+2+1+1+1+1+1+1,
2+1+1+1+1+1+1+1+1+1+1. So $P_1(12) = 7$ and $P_3(12) = 8$.

The following theorem reveals that $P_1(n)$ and $P_3(n)$ never differ by more
than 1.


Theorem. For each n > 0,

\[
P_1(n) - P_3(n) =
\begin{cases}
0 & \text{if } n \ne j(3j-1)/2, \\
(-1)^j & \text{if } n = 6j^2+j,\ 6j^2+5j+1,\ 6j^2+7j+2,\ \text{or } 6j^2+11j+5.
\end{cases}
\tag{5.1}
\]

Proof: The elementary techniques of partition theory [1; Ch. 1] reveal that the
coefficient of $z^m q^N$ in

\[ P(z,q) = \sum_{n=1}^{\infty}\frac{zq^n}{(1-z^2q^2)(1-z^2q^4)\cdots(1-z^2q^{2n})} \tag{5.2} \]

is the number of partitions of N into m parts wherein only the largest part
appears an odd number of times. Since P(z, q) is an odd function of z, we see that

\[ \sum_{n=1}^{\infty}\bigl(P_1(n) - P_3(n)\bigr)q^n = \frac{1}{i}\,P(i,q) = \sum_{n=1}^{\infty}\frac{q^n}{(1+q^2)(1+q^4)\cdots(1+q^{2n})}, \tag{5.3} \]

and the theorem now follows immediately by comparison with (4.2).
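
The theorem and the two worked examples can be checked by brute-force enumeration. The sketch below is illustrative only: the function names are hypothetical, and the index j in (5.1) is taken to run over the nonnegative integers, which is an assumption read off from the sums in (4.2).

```python
from collections import Counter

def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing lists."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest

def p1_p3(n):
    """(P_1(n), P_3(n)): qualifying partitions of n counted by number of parts mod 4."""
    counts = {1: 0, 3: 0}
    for p in partitions(n):
        mult = Counter(p)
        largest = p[0]
        # only the largest part may appear an odd number of times
        if mult[largest] % 2 == 1 and all(
            m % 2 == 0 for part, m in mult.items() if part != largest
        ):
            counts[len(p) % 4] += 1   # the number of parts is necessarily odd
    return counts[1], counts[3]

def predicted_difference(n):
    """Right side of (5.1), with j assumed to range over j >= 0."""
    j = 0
    while 6 * j * j + j <= n:
        if n in (6*j*j + j, 6*j*j + 5*j + 1, 6*j*j + 7*j + 2, 6*j*j + 11*j + 5):
            return (-1) ** j
        j += 1
    return 0

print(p1_p3(6), p1_p3(12))
```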
6. CONCLUSION. The biggest surprise is how closely the formulas in (1.2) and
(1.3) seem to be related. The numerators on the right-hand sides are essentially the
same except for a few sign changes, while (1.2) has a classical theta series in the
denominator ($\sum_{n=-\infty}^{\infty}(-1)^n q^{n^2} = \theta_4(0, q)$), and (1.3) has no denominator. Yet these two
formulas have very, very different proofs and are progressively harder than (2.1)
and (2.2).

There are companions to (1.2) and (1.3) that may be proved in very similar ways.
Namely [7; p. 31] (cf. [4, eq. (25)])

\[ \sum_{n=0}^{\infty}\frac{q^n}{(1-q)(1-q^2)\cdots(1-q^{2n+1})} = \frac{(1 + q^{7} + q^{17} + \cdots) - q^2\,(1 + q + q^{23} + \cdots)}{1 - 2q + 2q^4 - 2q^9 + 2q^{16} - 2q^{25} + \cdots} \tag{6.1} \]

and

\[ \sum_{n=0}^{\infty}\frac{q^n}{(1+q)(1+q^2)\cdots(1+q^{2n+1})} = (1 - q^{7} + q^{17} - \cdots) + q^2\,(1 - q + q^{23} - \cdots). \tag{6.2} \]


Finally, this paper is just another sample (as was [2]). There has been an
extensive series of papers written on the “Lost” Notebook. Many of these are
chronicled in [7; pp. xi-xxv]. Bruce Berndt and I are preparing a fully edited
account of the work in the “Lost” Notebook, a project that will take some time.
The Catalan Numbers, the Lebesgue
Integral, and $4^{n-2}$


Wen-Jin Woan, Lou Shapiro, and D. G. Rogers 


In this note we give a new proof of a theorem that is simple to state, rather 
elegant, and little known. The proof we present is novel in that it involves a use of 
generating functions that is akin to Lebesgue’s parable about a shopkeeper 
totalling the receipts at the end of the day. The Riemann integral corresponds to 
just adding up the receipts in order but the Lebesgue integral corresponds to first 
sorting the receipts by denomination, totalling the amount of each denomination, 
and then adding up the subtotals. 

A path pair of length n is a pair of paths that start at the origin, consist of n 
unit steps and meet again for the first time after n steps. All steps in these paths 
go East or North. A path pair may also be called a parallelo-polyomino. Figure 1 
illustrates one such path pair of length 8. 


Figure 1 


The number of path pairs of length n is $C_{n-1}$, where $C_0, C_1, C_2, \ldots$ are the
Catalan numbers. The first two proofs of this are by Levine [9] and Polya [10]. Our
main theorem is that the total area of these $C_{n-1}$ path pairs is $4^{n-2}$. After a quick
review of the Catalan numbers, we present brief proofs of both facts; the proof of
the $4^{n-2}$ result is new. A far different proof is alluded to in the paper of Fürlinger
and Hofbauer [4] and is attributed to a graduate student by the name of Schwarzler.
A different approach to a closely related result is given in [6]. An unsolved
problem is whether the $C_{n-1}$ path pairs of length n can be used as tiles to cover a
$2^{n-2} \times 2^{n-2}$ checkerboard. The case n = 5 makes an amusing puzzle. The 14
possible shapes are illustrated in Figure 2. To play the game, duplicate and enlarge
the 14 pieces by, say, a factor of four, cut them out, and then try to arrange them
so that they cover an 8 × 8 board.
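
For small n, both the count $C_{n-1}$ and the total area $4^{n-2}$ can be confirmed by exhaustive enumeration. The sketch below is an illustration rather than the authors' method: it assumes the upper path starts north and the lower path starts east, encodes paths as tuples of 'E'/'N' steps, and measures the enclosed area as the difference of the areas under the two paths.

```python
from itertools import product
from math import comb

def area_under(path):
    """Area between a lattice path of 'E'/'N' steps and the x-axis."""
    height = area = 0
    for step in path:
        if step == "N":
            height += 1
        else:
            area += height      # each east step sweeps a column of this height
    return area

def path_pairs(n):
    """Yield (upper, lower) path pairs of length n by brute force."""
    for up in product("EN", repeat=n - 1):
        for low in product("EN", repeat=n - 1):
            upper, lower = ("N",) + up, ("E",) + low
            xu = xl = 0
            ok = True
            for t in range(n):
                xu += upper[t] == "E"
                xl += lower[t] == "E"
                if t < n - 1 and xu >= xl:   # the paths may not meet before step n
                    ok = False
                    break
            if ok and xu == xl:              # and must meet after exactly n steps
                yield upper, lower

def stats(n):
    """(number of path pairs of length n, total enclosed area)."""
    count = total = 0
    for upper, lower in path_pairs(n):
        count += 1
        total += area_under(upper) - area_under(lower)
    return count, total

for n in range(2, 8):
    print(n, stats(n), (comb(2 * n - 2, n - 1) // n, 4 ** (n - 2)))
```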
Figure 2


The Catalan numbers are ubiquitous in mathematics, showing up in the enu- 
meration of lattice paths, planar trees, ballot sequences, fluctuations of the lead, 
Young tableaux, stack sortable permutations, triangulations of an n-gon, increas- 
ing functions, binary trees, and so on. In the last few years they have become a 
standard item in combinatorics texts, so we will resist the temptation to go into 
detail, and refer instead to Comtet [3], Wilf [13], Stanley (especially the forthcom- 
ing volume 2) [12], and the Schaum’s outline by Balakrishnan [1], among many 
others. The entertaining survey by Gardner [5] is highly recommended. 

The Catalan numbers are defined recursively by $C_0 = 1$ and $C_{n+1} = C_0C_n +
C_1C_{n-1} + \cdots + C_nC_0$. The first few terms are 1, 1, 2, 5, 14, 42, 132, 429, ..., and the
general term is

\[ C_n = \frac{1}{n+1}\binom{2n}{n}. \]

The generating function for the Catalan numbers is $C(z) := \sum_{n=0}^{\infty} C_n z^n =
(1 - \sqrt{1-4z})/(2z)$. We need the following three facts:

\[ C(z) = 1 + z(C(z))^2, \quad\text{or}\quad C = 1 + zC^2, \tag{1} \]

\[ C(z)\sqrt{1-4z} = 1 - zC^2(z), \quad\text{or}\quad C\sqrt{1-4z} = 1 - zC^2, \tag{2} \]

and

\[ C^2\sqrt{1-4z} = 1 - z^2C^4. \tag{3} \]


The first follows from the definition of C,, the third results from multiplying the 
first two, and a combinatorial proof of the second is included at the end of the 
article. 

If we sketch the first few cases we see that there are 1, 2, 5, and 14 path pairs of
lengths 2, 3, 4, and 5, respectively. Generalizing a bit, we let b(n, k) be the number
of nonintersecting paths from the origin that are k apart after n steps. Call these
partial path pairs. (The actual distance between the endpoints is $k\sqrt{2}$, but we never
need the $\sqrt{2}$ and omit it.) Table 1 illustrates the first few values of the b(n, k).

TABLE 1

n\k      1     2     3     4     5     6     7
1        1
2        2     1
3        5     4     1
4       14    14     6     1
5       42    48    27     8     1
6      132   165   110    44    10     1
7      429   572   429   208    65    12     1

The next observation is that b(n + 1, k) = b(n, k − 1) + 2b(n, k) + b(n, k + 1)
for k ≥ 2. This holds since the two paths can get closer at the last step in one way
(the lower path goes north while the upper path goes east), they can stay the same
distance apart in two ways (both east or both north), or they can get further apart
in one way (lower path east, upper path north). This holds even for k = 1 if the
nonintersecting condition is interpreted as b(n, 0) = 0 for n ≥ 1.

Now an easy induction gives us

\[ b(n,k) = \frac{k}{n}\binom{2n}{n-k}. \]

Since partial path pairs that are one apart after n − 1 steps can be completed to
path pairs in n steps, we see that the number of path pairs of length n is

\[ b(n-1, 1) = \frac{1}{n-1}\binom{2(n-1)}{n-2} = \frac{(2n-2)!}{(n-1)!\,n!} = C_{n-1}. \]
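
The recurrence, the closed form, and the link to the Catalan numbers can be compared numerically. A minimal sketch, with the table size `n_max` and the function name chosen here only for illustration:

```python
from math import comb

def b_table(n_max):
    """b[n][k] from b(n+1,k) = b(n,k-1) + 2 b(n,k) + b(n,k+1), with b(n,0) = 0."""
    b = [[0] * (n_max + 2) for _ in range(n_max + 1)]
    b[1][1] = 1                       # one step: the paths are one unit apart
    for n in range(1, n_max):
        for k in range(1, n + 2):
            b[n + 1][k] = b[n][k - 1] + 2 * b[n][k] + b[n][k + 1]
    return b

b = b_table(10)
closed_form_ok = all(
    b[n][k] * n == k * comb(2 * n, n - k)      # b(n,k) = (k/n) * C(2n, n-k)
    for n in range(1, 11) for k in range(1, n + 1)
)
catalan_ok = all(
    b[n - 1][1] == comb(2 * n - 2, n - 1) // n  # b(n-1,1) = C_{n-1}
    for n in range(2, 11)
)
print(b[7][1:8], closed_form_ok, catalan_ok)
```

The printed row reproduces the last line of Table 1.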


With the preliminaries out of the way we can proceed to the proof of the main
theorem. The generating function for the first column of Table 1 is

\[ \sum_{n=1}^{\infty} C_n z^n = C(z) - 1 = zC^2(z). \]

The generating function for the k-th column is

\[ \sum_{n=0}^{\infty} b(n,k)z^n = \bigl(zC^2(z)\bigr)^k = \sum_{n=k}^{\infty}\frac{k}{n}\binom{2n}{n-k}z^n, \]

as can be seen by first looking at the last place where the two paths are one unit
apart, from there continuing until the last time the two paths are two units apart,
and so on. The sequence of steps that takes the paths from m units apart (for the
last time) to m + 1 units apart is the same for all m.

If we square each of these generating functions, we find that

\[ \Bigl(\sum_{n=0}^{\infty} b(n,m)z^n\Bigr)^2 = \bigl(zC^2(z)\bigr)^{2m} = \sum_{n=2m}^{\infty}\frac{2m}{n}\binom{2n}{n-2m}z^n. \]

Thus if we define

\[ B_m^*(z) = \Bigl(\sum_{n=0}^{\infty} b(n,m)z^n\Bigr)^2 = \sum_{n=2m}^{\infty} B^*(m,n)z^n, \]

then

\[ B^*(m,n) = \sum_{j} b(j,m)\,b(n-j,m) = \frac{2m}{n}\binom{2n}{n-2m}. \]
Figure 3


The combinatorial meaning of B*(m, n) is the weighted number of path pairs of
length n, where the weights are the number of times that the two paths are m
units apart. Thus, $B_m^*(z) = \sum_{n=2m}^{\infty} B^*(m,n)z^n$ can be thought of as the generating
function for the shopkeeper counting the number of m dollar bills. Similarly,
mB*(m, n) is the area of all the slices consisting of m squares on a diagonal
between two paths of a path pair of length n (or, more financially, as the amount
of money coming in as m dollar bills). Figure 3 illustrates a typical path pair
contributing three slices of length 2 to the count B*(2, 9).

If we mark each square involved in each of these slices, then this path pair 
contributes six squares, as illustrated in Figure 4. 


Figure 4 


The main theorem will be proven if we can show that

\[ \sum_{m \ge 1} mB^*(m,n) = 4^{n-2}. \]

This can be done by manipulation of binomial coefficients (see [7, p. 87] for a 
similar result and similar hand waving), but we proceed by generating functions. If 
we tabulate the first few values of the B*(m, n), we have

TABLE 2

n\m      1     2     3
0        0     0     0
1        0     0     0
2        1     0     0
3        4     0     0
4       14     1     0
5       48     8     0
6      165    44     1
7      572   208    12

and the column generating functions are $z^2C^4, z^4C^8, z^6C^{12}, \ldots$ while the generating
function for the sequence 1, 2, 3, 4, ... is $1/(1-z)^2$. Hence the generating
function that we want is

\[ z^2C^4 + 2(z^2C^4)^2 + 3(z^2C^4)^3 + \cdots = \frac{z^2C^4}{(1 - z^2C^4)^2} = \frac{z^2C^4}{\bigl(C^2\sqrt{1-4z}\bigr)^2} = \frac{z^2}{1-4z} = \sum_{n=2}^{\infty} 4^{n-2}z^n. \]


The second equality follows from equation (3). This completes the proof of the 
main theorem except for a proof of (2). We start by rewriting (2) as 


\[ \frac{1}{\sqrt{1-4z}} = \frac{C}{1 - zC^2} = C + C(zC^2) + C(zC^2)^2 + \cdots. \]

Consider now a single path starting at the origin, with unit steps east and north,
ending up on the line y = x after 2n steps. The number of such paths is $\binom{2n}{n}$ and
thus the appropriate generating function is

\[ \sum_{n=0}^{\infty}\binom{2n}{n}z^n = \frac{1}{\sqrt{1-4z}}. \]


We decompose each such path into subpaths with a new subpath starting whenever 
the original path crosses the diagonal line y = x. The generating function for the 
number of paths that start and end on the diagonal but never cross is C. If we want
to eliminate degenerate paths, the appropriate generating function is $C - 1 = zC^2$.
We take any one of these binomial paths and assume momentarily that it starts off
below the diagonal. We would pick up a factor C until the first time it crosses the
line y = x. A crossing implies that we have a nontrivial subpath on the other side
of x = y and this continues for each crossing. The case in which we start by going
above the diagonal corresponds to multiplying by the constant term 1 in the first
factor $C = 1 + z + 2z^2 + 5z^3 + \cdots$, thus allowing for the possibility that the
initial subpath below the diagonal has no edges.

As with almost anything involving binomial coefficients, there is a large body of 
related literature. One good source that collects and extends many related results 
is the monograph of Bousquet-Mélou [2]. In [8] there is another moment counting
problem involving a different Catalan setting, but where the final answer is a very
similar $4^{n-1}$. Yet another $4^{n-1}$ result in a Catalan setting is in [6]. This turns out to
be analogous to our present result and indeed computes the higher moments as
well, but the proof is less combinatorial and more involved with the Eulerian and
tangent numbers coming into play as coefficients.
Good Matrices: Matrices that Preserve Ideals


R. Bruce Richter and William P. Wardlaw 


1. INTRODUCTION. The following problem of Mawaffaq Hajja appeared in the 
Monthly [5]: 


Let A and B be matrices with integer entries of sizes r by n and n by r,
respectively, with r < n. Suppose that AB is an r by r identity matrix. Show
that A can be enlarged to an n by n integral matrix having an integral
inverse.


The case r = 1 is fairly well-known: if $A = [a_1, \ldots, a_n]$ and $B = [b_1, \ldots, b_n]^T$ are
such that AB = [1], then $a_1, \ldots, a_n$ are relatively prime. In this case, there is an
n × n integral matrix that has an integral inverse and whose first row is $a_1, \ldots, a_n$;
see [10, Thm. II.1, p. 13].

Hajja’s Problem suggests a natural generalization that has not received much
attention and will be a main topic of this article: what are the properties that two
or more given rows must have so they can serve as the first rows of an invertible
matrix?

The property central to our paper is the following. A matrix A with entries in a
commutative ring R with unity is left good if, for every vector x, the ideal (xA)
generated by the entries in the vector xA is the same as the ideal (x) generated by
the entries in the vector x. In the context of matrices with integral entries, this is
equivalent to requiring that the greatest common divisor of the entries in xA is the
same as the greatest common divisor of the entries in x. Since, for any matrix A
and any vector x, it is obvious that (xA) ⊆ (x), the content of left goodness is in the
reverse containment. Our goal is to prove the following.


Main Theorem. Consider the following statements about an r × n matrix A over the
commutative ring with unity R.


(1) The rows of A extend to a basis of $R^{1 \times n}$.
(2) A can be enlarged to an n × n matrix invertible over R.
(3) A has a right inverse over R.
(4) The ideal generated by all r × r subdeterminants of A is R.
(5) A is left good.


Then (i) (1) ⇔ (2) ⇒ (3) ⇒ (4) ⇔ (5).
(ii) If R is a principal ideal ring, then each of (1)-(5) is equivalent to

(6) A has Smith Normal Form $[I_r\ \ 0]$.

Hajja’s problem is solved by observing that, over the integers, (3) implies (2).

We note that there is another direction in which the r = 1 case of Hajja’s
problem can be generalized. From the Main Theorem we see that if a is a vector
with entries in a principal ideal ring R (such as the integers or the ring k[x] of
polynomials in one variable over a field k [6, p. 180, ex. 9]) such that (a) = R, then
a is the first row of an invertible matrix over R. This statement has been made
quite famous by the proof that the result also holds over a polynomial ring
$k[x_1, \ldots, x_n]$ in several variables over a field. This fact is crucial in the demonstration
of Serre’s Conjecture that every projective module over $k[x_1, \ldots, x_n]$ is free
[8, pp. 488-492]. As far as we know, no one seems to have looked for a characteri- 
zation of those rings for which the r = 1 case holds. 

To allay the reader’s suspense, we include the following solution to Hajja’s 
problem, offered to us by a referee. This is essentially the published solution [5]. 


Proposition 1. Let A be an r × n matrix with integral entries. Let B be any integral
matrix such that $AB = I_r$. Then there exist integral matrices A′ and B′ such that

\[ \begin{bmatrix} A \\ A' \end{bmatrix}\begin{bmatrix} B & B' \end{bmatrix} = I_{r+1}. \]


Proof: Let $A_1, \ldots, A_r$ be the rows of A and let $B_1, \ldots, B_r$ be the columns of B.
We will add $A_{r+1}$ to A to get A′ and $B_{r+1}$ to B to get B′ such that $A'B' = I_{r+1}$.
For $A_{r+1}$, choose any vector in the left null space of B having relatively prime
coordinates.

Now let C be any column such that $A_{r+1}C = 1$. We consider choices for $B_{r+1}$
of the form

\[ B_{r+1} = C + a_1B_1 + \cdots + a_rB_r. \]

For i = 1, 2, ..., r the relation $A_iB_{r+1} = 0$ is equivalent to $a_i = -A_iC$. Hence
$B_{r+1} = C - (A_1C)B_1 - \cdots - (A_rC)B_r$ is a suitable column by which to extend B.
∎
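
One step of this construction can be carried out on a small example. The sketch below is only an illustration of the proof: the function names are hypothetical, the caller is assumed to supply a suitable new row and column C by hand, and the sample matrices are chosen here, not taken from the article.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def extend_once(A, B, new_row, C):
    """One step of the proof: A is r x n, B is n x r, A B = I_r.

    new_row must satisfy new_row * B = 0 and C must satisfy new_row . C = 1;
    both are assumed to be supplied by the caller.
    """
    r = len(A)
    # a_i = -A_i C, so that A_i * B_{r+1} = 0 for each existing row A_i
    coeffs = [-sum(a * c for a, c in zip(A[i], C)) for i in range(r)]
    new_col = [C[j] + sum(coeffs[i] * B[j][i] for i in range(r)) for j in range(len(C))]
    A_ext = A + [new_row]
    B_ext = [B[j] + [new_col[j]] for j in range(len(C))]
    return A_ext, B_ext

A = [[2, 3, 5]]
B = [[-1], [1], [0]]          # A B = [1]
new_row = [1, 1, 0]           # new_row * B = 0, entries relatively prime
C = [1, 0, 0]                 # new_row . C = 1
A_ext, B_ext = extend_once(A, B, new_row, C)
identity = matmul(A_ext, B_ext)
print(A_ext, B_ext, identity)
```

Repeating the step n − r times enlarges A to a square integral matrix with an integral inverse.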


2. AN INTRODUCTION TO GOOD MATRICES. In this section, we provide some 
examples to illustrate good matrices and prove some elementary facts about them. 
Consider the integral matrix

\[ A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. \]

If $x = [x_1\ \ x_2]$ is any integral vector, then $xA = [x_1\ \ x_1 + x_2]$. If $z = ax_1 + bx_2$ is an
integral combination of $x_1$ and $x_2$, then $z = (a - b)x_1 + b(x_1 + x_2)$ is also an
integral combination of $x_1$ and $x_1 + x_2$. Therefore, the ideal generated by the
entries in x is contained in the ideal generated by the entries in xA. As mentioned
earlier, the reverse inclusion always holds and, therefore, this matrix A is left
good.

Consider next the integral matrix

\[ A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}. \]

This matrix is not a good matrix since the ideal generated by x = [1 1] is Z, while
the ideal generated by xA = [0 2] is the set of even integers.
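
Over the integers, left goodness says that gcd(x) = gcd(xA) for every integral vector x, so the two examples can be probed by a search over small vectors. This is a sketch under stated assumptions: the search bound and the helper names are arbitrary, and a finite search can refute left goodness but cannot establish it.

```python
from math import gcd
from itertools import product

def vec_gcd(v):
    """Nonnegative gcd of the entries of an integer vector."""
    g = 0
    for entry in v:
        g = gcd(g, entry)
    return g

def times(x, A):
    """Row vector x times matrix A (list of rows)."""
    return [sum(xi * row[j] for xi, row in zip(x, A)) for j in range(len(A[0]))]

def first_counterexample(A, bound=4):
    """Search vectors with entries in [-bound, bound] for one with gcd(x) != gcd(xA)."""
    for x in product(range(-bound, bound + 1), repeat=len(A)):
        if vec_gcd(x) != vec_gcd(times(x, A)):
            return list(x)
    return None

good = [[1, 1], [0, 1]]
bad = [[1, 1], [-1, 1]]
print(first_counterexample(good), first_counterexample(bad))
print(vec_gcd([1, 1]), vec_gcd(times([1, 1], bad)))
```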
The following elementary properties of left good matrices are useful ones for us. 


Lemma 2. Let A and B be matrices with entries in the commutative ring R. 


(1) If both A and B are left good and AB is defined, then AB is left good. 
(2) If AB is left good, then A is left good. 
(3) If AB = I, then A is left good.
(4) If P and Q are invertible over R and PAQ is defined, then PAQ is left good if 
and only if A is left good. 


The reader will note that (3) is the implication (3) = (5) of the Main Theorem. 


Proof:


(1) For any vector x for which xA is defined, (x) = (xA) = ([xA]B) = (x[AB]).

(2) Since AB is left good, for any vector x, (x) = (xAB). Since (xAB) ⊆ (xA) ⊆
(x), equality holds throughout and (x) = (xA), as required.

(3) The matrix I is trivially left good, so AB is left good. By (2), A is left good.

(4) Immediate from (1) and (3). ∎


We conclude this section with the relationship between being left good and the 
subdeterminants of the matrix, i.e., the equivalence of (4) and (5) in the Main 
Theorem. 


Proposition 3. Let A be an r × n matrix with entries in a commutative ring R. Then A
is left good if and only if the ideal $D_r$ generated by the r × r subdeterminants of A is R.


Proof: Suppose A is left good and suppose $D_r$ is a proper ideal in R. Let $\bar{R}$ be the
quotient ring $R/D_r$. Let $\bar{A}$ be the matrix with entries in $\bar{R}$ corresponding to A. All
the r × r subdeterminants of $\bar{A}$ are 0 in $\bar{R}$, so the rank of $\bar{A}$ is less than r.

Therefore, there is a nonzero vector $\bar{x}$ with entries in $\bar{R}$ such that $\bar{x}\bar{A} = 0$; this
is a simple consequence of [9, Thm. 51, p. 159]. Let x be a vector with entries in R
corresponding to $\bar{x}$. Since A is left good, $(x) = (xA) \subseteq D_r$, whence $\bar{x} = 0$ in $\bar{R}$, a
contradiction.

For the converse, suppose $D_r = R$ but A is not left good. Let x be a vector with
entries in R such that (x) ≠ (xA). Since $(xA) \subsetneq (x)$, $(xA) \ne R$. In the quotient
$\bar{R} = R/(xA)$, the vector $\bar{x}$ corresponding to x is not 0, but $\bar{x}\bar{A}$ is 0.

Then $\bar{A}$ has rank less than r, so there is a nonzero annihilator $\bar{a}$ of the r × r
subdeterminants of $\bar{A}$. (We heartily recommend the treatment of the rank of a
matrix given in [3, Ch. 4] or [9, Ch. VIII].) Thus, in R, $a \notin (xA)$ and $aD_r \subseteq (xA)$.
This contradicts the assumption that $D_r = R$, since then $a \in aD_r$. ∎
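
Over the integers, the ideal $D_r$ equals Z exactly when the gcd of the r × r subdeterminants is 1, so Proposition 3 gives a finite test for left goodness of an integral matrix. The following sketch specializes to R = Z, which is an assumption made here for illustration; the function names are hypothetical and the cofactor expansion is only suitable for small matrices.

```python
from itertools import combinations
from math import gcd

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def minors_gcd(A):
    """gcd of all r x r subdeterminants of an r x n integer matrix A."""
    r, n = len(A), len(A[0])
    g = 0
    for cols in combinations(range(n), r):
        g = gcd(g, det([[row[c] for c in cols] for row in A]))
    return g

def is_left_good_over_Z(A):
    """Left goodness test via Proposition 3, specialized to the integers."""
    return minors_gcd(A) == 1

print(is_left_good_over_Z([[1, 1], [0, 1]]))    # first example of Section 2
print(is_left_good_over_Z([[1, 1], [-1, 1]]))   # second example of Section 2
print(is_left_good_over_Z([[2, 3, 5]]))         # a single row with coprime entries
```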


3. THE SITUATION FOR COMMUTATIVE RINGS WITH UNITY. In this section,
we complete the proof of (i) of the Main Theorem. We have yet to prove the
equivalence of (1) and (2) and the implication (2) implies (3). The former equivalence
is contained in the following fact, which is given in [11, Cor. 1.2].


Lemma 4. Let R be a commutative ring with unity and let P be an n × n matrix with
entries in R. Then the following are equivalent:


(1) The rows of P form a basis of $R^{1 \times n}$.
(2) P has an inverse with entries in R.
(3) det P is a unit in R.


We remind the reader that an element a in a commutative ring R with unity 1
is a unit if there is an element b of R such that ab = 1.
(2) implies (3). Suppose A extends to the n × n matrix

\[ A^* = \begin{bmatrix} A \\ A' \end{bmatrix} \]

with A* invertible over R. Let B* be an inverse of A* and write B* = [B B′], with
B an n × r matrix. Clearly,

\[ A^*B^* = \begin{bmatrix} AB & AB' \\ A'B & A'B' \end{bmatrix} = I_n, \]

which implies that $AB = I_r$, as required.
This completes the proof of (i) of the Main Theorem. ∎


4. PRINCIPAL IDEAL RINGS. A principal ideal ring is a commutative ring R 
with unity in which every ideal is principal; i.e., if J is an ideal, then there is an 
element a © R such that J = (a). We begin with a discussion of the Smith Normal 
Form of a matrix with entries in the principal ideal ring R. 

Let $U_k$ denote the set of all k × k matrices invertible over R. Two r × n
matrices A and B over R are equivalent, written A ~ B, if there is a $P \in U_r$ and a
$Q \in U_n$ such that B = PAQ. It is easy to see that ~ is an equivalence relation.

Let A be an r × n matrix with entries in the principal ideal ring R. We assume
for the moment that r < n. A Smith Normal Form of A is an r × n matrix

\[
S(A) = \begin{bmatrix}
d_1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
0 & d_2 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & d_r & 0 & \cdots & 0
\end{bmatrix}
\tag{1}
\]

such that A ~ S(A) and $(d_r) \subseteq (d_{r-1}) \subseteq \cdots \subseteq (d_1)$.

If it happens that A is an n × r matrix with r < n, then the Smith Normal
Form for A is the transpose of the matrix displayed in (1) and is still denoted by
S(A).

A matrix A can have many different Smith Normal Forms. For example, we can
replace any $d_i$ by $d_iu$, for any unit u in R. Over the integers, we usually require
$d_i \ge 0$.

Two elements a and b of a commutative ring R are associates if there is a unit 
u of R such that a = ub. Over general commutative rings, we can have (a) = (b) 
without a and b being associates. (See [1], [2], or [3, Ex. 5.15, p. 42].) The following 
result is proved in [3, Thm. 15.24, p. 194]. 


Theorem 5. Let R be a principal ideal ring and let A be any matrix with entries in R.
Then:

(1) A has a Smith Normal Form;
(2) Suppose $d_1, d_2, \ldots, d_r$ and $d'_1, d'_2, \ldots, d'_r$ are the diagonal entries of two r × n
matrices that are otherwise zero and suppose the $d_i$ are the entries of a Smith
Normal Form of A. Then the $d'_i$ are the entries of a Smith Normal Form for A if
and only if, for each i = 1, 2, ..., r, $d'_i$ is an associate of $d_i$;
(3) With $d_1, d_2, \ldots, d_r$ as in (2), the ideal generated by the i × i subdeterminants of
A is $(d_1d_2 \cdots d_i)$.
We remark that Theorem 5 is trivial if R is a field, while the proof when R = Z
is aided by the Euclidean algorithm. Basically, one wants to do reversible row and 
column operations to turn A into S(A). For an arbitrary principal ideal ring, the 
proof is somewhat technical. 
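
To make the remark about reversible row and column operations concrete, here is a minimal sketch for the case R = Z only. It is an illustration under our own assumptions, not the proof in [3]: the function name `smith_normal_form` and the pivoting strategy (always move a nonzero entry of smallest absolute value to the corner) are choices made for this example.

```python
def smith_normal_form(A):
    """Return a Smith Normal Form of the integer matrix A (a list of rows).

    Only reversible integer row/column operations are used: swaps, adding an
    integer multiple of one row (column) to another, and negating a row.
    """
    M = [list(row) for row in A]
    r, n = len(M), len(M[0])
    for t in range(min(r, n)):
        while True:
            # Nonzero entries of the untouched block M[t:, t:], smallest first.
            nonzero = [(abs(M[i][j]), i, j)
                       for i in range(t, r) for j in range(t, n) if M[i][j]]
            if not nonzero:
                return M                       # the remaining block is zero
            _, pi, pj = min(nonzero)
            M[t], M[pi] = M[pi], M[t]          # row swap
            for row in M:                      # column swap
                row[t], row[pj] = row[pj], row[t]
            p = M[t][t]
            clean = True
            for i in range(t + 1, r):          # clear column t below the pivot
                q = M[i][t] // p
                M[i] = [x - q * y for x, y in zip(M[i], M[t])]
                clean = clean and M[i][t] == 0
            for j in range(t + 1, n):          # clear row t right of the pivot
                q = M[t][j] // p
                for row in M:
                    row[j] -= q * row[t]
                clean = clean and M[t][j] == 0
            if not clean:
                continue                       # a smaller remainder appeared
            bad = [i for i in range(t + 1, r)
                   if any(M[i][j] % p for j in range(t + 1, n))]
            if not bad:
                break                          # pivot divides the whole block
            # Pull a non-divisible row into row t; the next pass shrinks the pivot.
            M[t] = [x + y for x, y in zip(M[t], M[bad[0]])]
        if M[t][t] < 0:                        # multiply row t by the unit -1
            M[t] = [-x for x in M[t]]
    return M


S = smith_normal_form([[2, 4, 4], [-6, 6, 12], [10, 4, 16]])
```

Each pass either finishes the current diagonal entry or strictly decreases the absolute value of the pivot, which is the Euclidean-algorithm idea mentioned above. The sketch returns only S(A), not the matrices P and Q.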

We are now prepared for the proof of (ii) of the Main Theorem. 


Proof of MT (ii). It suffices to prove that (4) implies (6) and (6) implies (2). 

(4) implies (6). The ideal generated by the r × r subdeterminants of A is 
(d_1 d_2 ... d_r). We are assuming that (d_1 d_2 ... d_r) = R, which means each d_i is a 
unit. Hence, the diagonal entries of a Smith Normal Form for A are all units, i.e., 
the matrix [I_r 0] is a Smith Normal Form for A. 

(6) implies (2). The hypothesis is that A = P[I_r 0]Q, for some P ∈ U_r and 
Q ∈ U_n. Then 

\[
A^* = \begin{bmatrix} P & 0 \\ 0 & I_{n-r} \end{bmatrix} Q
\]

is an extension of A to an n × n matrix A* with inverse 

\[
Q^{-1} \begin{bmatrix} P^{-1} & 0 \\ 0 & I_{n-r} \end{bmatrix}.
\]


The proof of (6) ⇒ (2) works over any commutative ring R with unity, while 
(4) ⇒ (6) depends only on the existence of a Smith Normal Form for A. Thus, 
these implications do not depend directly on the fact that the ring is a principal 
ideal ring. However, matrices over general commutative rings do not necessarily 
have Smith Normal Forms. This is where we use the assumption that R is a 
principal ideal ring. 


5. RELATED MATTERS. In this section, we deal with items that are closely 
related to the Main Theorem but are not required for its proof. 

The proof of Proposition 1 requires only the following property of the ring R: 
for any r < n and any given r × n matrix A with a right inverse, there is some 
vector v such that Av = 0 and (v) = R. Thus, this condition on R is sufficient for 
the implication (3) ⇒ (2). 

Conversely, if (3) ⇒ (2) always holds over R, then we claim that for any r < n 
and any given r × n matrix A with a right inverse, there is some vector v satisfying 
Av = 0 and (v) = R. To see this, note that A extends to a matrix 

\[
\begin{bmatrix} A \\ A' \end{bmatrix}
\]

that is invertible, with inverse [B  B']. If v is any column of B', then Av = 0. On 
the other hand, there is a row u of A' such that uv = 1, so (v) = R. 

More generally, for principal ideal rings we have the following fact, which we 
need later. 


Proposition 6. If r < n and A is any r × n matrix with entries in a principal ideal ring 
R, then there is a vector v such that Av = 0 and (v) = R. 


Proof: By Theorem 5, A has a Smith Normal Form, so there are invertible 
matrices P and Q and an r × r diagonal matrix D such that 

\[
PAQ = [D \;\; 0].
\]

Thus, AQ = [P^{-1}D  0], so we may take the last column of Q for v. ∎ 

The case r = n of the Main Theorem is quite interesting in its own right, since 
in this case all six of the statements are equivalent. In particular, we note the 
following (see [12]). 


Proposition 7. Let R be a commutative ring with unity and let A be an n × n matrix 
over R. Then A is left good if and only if A is invertible over R. 


Proof: By the equivalence of (4) and (5) in the Main Theorem, A is left good if 
and only if (det A) = R. Clearly, (det A) = R if and only if det A is a unit. By 
Lemma 4, det A is a unit if and only if A is invertible. ∎ 


The same referee who in Proposition 1 improved our solution of Hajja's 
problem offered the following observation. Suppose that, over any commutative 
ring with unity, A and B^T are (n - 1) × n matrices such that AB = I_{n-1}. Then A 
and B extend to n × n matrices A' and B' such that A'B' = I_n. For example, in 
the case n = 3 and r = 2, let A_1 and A_2 be the rows of A and let B_1 and B_2 be 
the columns of B. Then A_i B_j is 1 if i = j and is 0 otherwise, i.e., A_i B_j is the 
Kronecker delta δ_ij. The additional third row A_3 to add to A must be orthogonal 
to both B_1 and B_2, which virtually forces the choice A_3 = B_1 × B_2. Similarly 
B_3 = A_1 × A_2. That A_3 B_3 = 1 follows from the relation ([4], ex. 9, p. 425) 

\[
(a \times b) \cdot (c \times d) = (a \cdot c)(b \cdot d) - (a \cdot d)(b \cdot c).
\]


For any n > 2, define the generalized cross product ⊗(u_1, ..., u_{n-1}) of n - 1 
vectors u_1, ..., u_{n-1} to be the formal n × n determinant whose first row is the 
formal vector e_1, e_2, ..., e_n and whose remaining rows are u_1, ..., u_{n-1}, where e_i is 
the standard unit vector of all 0's except for a 1 in the i-th position. We need the 
following (relatively straightforward) properties of ⊗: 

(1) for each i = 1, 2, ..., n - 1, u_i · ⊗(u_1, ..., u_{n-1}) = 0; and 
(2) ⊗(u_1, ..., u_{n-1}) · ⊗(v_1, ..., v_{n-1}) = det(u_i · v_j). 


Now suppose A and B^T are (n - 1) × n matrices with AB = I_{n-1}. Let 
A_1, ..., A_{n-1} be the rows of A and let B_1, ..., B_{n-1} be the columns of B. Then 
we can add the row ⊗(B_1, ..., B_{n-1}) to A and the column ⊗(A_1, ..., A_{n-1}) to B 
to create square matrices A' and B' satisfying A'B' = I_n. 

Thus, for any commutative ring with unity, if A and B satisfy the assumptions 
of Hajja's problem and r = n - 1, then A and B can be extended to invertible 
matrices. 
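
The referee's construction is easy to try out over the integers. The following is a minimal sketch, with helper names (`det`, `gen_cross`, `extend`, `matmul`) and the sample matrices chosen by us for illustration; the determinant is computed by naive cofactor expansion, which is adequate only for small n.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (exact for ints)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def gen_cross(vectors):
    """Generalized cross product of n-1 vectors of length n.

    Coordinate j is the cofactor of e_j in the formal determinant whose first
    row is e_1, ..., e_n and whose remaining rows are the given vectors.
    """
    n = len(vectors) + 1
    return [(-1) ** j * det([list(v[:j]) + list(v[j + 1:]) for v in vectors])
            for j in range(n)]


def extend(A, B):
    """Extend an (n-1) x n matrix A and an n x (n-1) matrix B with AB = I."""
    cols_B = [list(c) for c in zip(*B)]
    new_row = gen_cross(cols_B)        # orthogonal to every column of B
    new_col = gen_cross(A)             # orthogonal to every row of A
    A_ext = [list(r) for r in A] + [new_row]
    B_ext = [list(r) + [x] for r, x in zip(B, new_col)]
    return A_ext, B_ext


def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]


# Sample data (our choice): a 2 x 3 matrix A and a 3 x 2 matrix B with AB = I_2.
A = [[1, 2, 3], [0, 1, 4]]
B = [[6, 3], [-4, -3], [1, 1]]
A_ext, B_ext = extend(A, B)
product = matmul(A_ext, B_ext)
```

For n = 3 the added row and column are the ordinary cross products B_1 × B_2 and A_1 × A_2, as in the example above.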

It is straightforward to show that if A is a left good matrix, then any matrix 
obtained from A by deleting rows is also left good. We have not found this 
particularly useful. However, it would be interesting to know the answer to the 
following open problem: Is every left good r × n matrix over a commutative ring R 
equivalent over R to the matrix [I_r 0]? This would show the equivalence of all six 
statements in the Main Theorem for any commutative ring. 

We also note the following fact, which shows that left good matrices over 
principal ideal rings can be “arbitrarily bad” for multiplication on the right. 


Proposition 8. Let r < n, let A be an r × n left good matrix over the principal ideal 
ring R, and let y ∈ R^{r×1}. Then there is an x ∈ R^{n×1} such that (x) = R and Ax = y. 


Proof: By the Main Theorem, there is a matrix B such that AB = I. Let x_0 = By. 
Since B is right good (defined in the obvious way), we have (y) = (By) = (x_0). 
By Proposition 6, there is a v such that (v) = R and Av = 0. Set x = x_0 + v. 
Then Ax = y. Thus, (x_0) = (y) = (Ax) ⊆ (x). We conclude that (v) = (x - x_0) ⊆ 
(x), so R = (v) ⊆ (x), as required. ∎ 


In conclusion, we would like to introduce one more idea. A matrix A is weakly 
left good if (x) = R implies (xA) = R. Clearly, every left good matrix is weakly left 
good. 


Proposition 9. Suppose R is a principal ideal ring. Then A is weakly left good over R if 
and only if A is left good over R. 


Proof: Only one implication is nontrivial. The proof hinges on the following fact. 


Lemma 10. Suppose R is a principal ideal ring. If w is a row vector and (d) = (w), 
then there is a vector y such that w = dy and (y) = R. 


Given this fact, suppose A is weakly left good. We must show that if wA is 
defined, then (wA) = (w). Let d be an element of R that generates (w). Then by 
Lemma 10 there is a vector y over R such that w = dy and (y) = R. Because A is 
weakly left good, we have (yA) = (y). Clearly, (w) = (d) = dR = d(yA) = (wA), as 
required. ∎ 


Proof of Lemma 10. Note that w has Smith Normal Form [d 0 ... 0]. Thus, there 
are invertible matrices P and Q over R such that w = P[d 0 ... 0]Q. Since P = [c] 
is 1 × 1, it follows that w = [d 0 ... 0]Q', for the invertible matrix Q' = cQ. Then 
w = d[1 0 ... 0]Q', so we may take y = [1 0 ... 0]Q'. ∎ 

A Converse of the Mean Value Theorem 


Jingcheng Tong and Peter A. Braza 


1. INTRODUCTION. The mean value theorem for differentiation has a common 
geometric interpretation: there is a tangent line to the curve y = f(x) defined on 
[a, b] whose slope is the same as the secant line through the endpoints (a, f(a)), 
(b, f(b)). But does every tangent line have a corresponding parallel secant line? 


2. A DIFFERENT VIEW OF THE MEAN VALUE THEOREM. The mean value 
theorem is: If f is continuous on [a, b] and differentiable on (a, b), then there is 
some c ∈ (a, b) such that f'(c) = (f(b) - f(a))/(b - a). Alternatively, it can be 
expressed geometrically as: If f is continuous on [a, b] and differentiable on (a, b) 
then for any subinterval (a_1, b_1) ⊆ [a, b], there is a tangent line at a certain point 
(c, f(c)) that is parallel to the secant line passing through (a_1, f(a_1)) and (b_1, f(b_1)). 

We consider the converse: given a tangent line passing through (c, f(c)), is there 
a secant line passing through some points (a_1, f(a_1)) and (b_1, f(b_1)) that is parallel 
to it? The fundamental question we discuss is more precisely expressed as, for each 
c ∈ (a, b), is there a nonempty subinterval (a_1, b_1) ⊆ [a, b] such that f'(c) = 
(f(b_1) - f(a_1))/(b_1 - a_1)? We consider the strong form of the converse in which 
c ∈ (a_1, b_1) and the weak form, which does not require c ∈ (a_1, b_1). Geometrically, 
the strong form requires the secant line with slope f'(c) to intersect the 
graph of f on opposite sides of c, whereas the secant line need not intersect the 
graph of f on opposite sides of c in the weak form. 

A simple example shows that the converse need not hold in either the strong or 
weak forms. If f(x) = x^3 on [-1, 1], then f'(0) = 0 but (f(b_1) - f(a_1))/(b_1 - a_1) 
= (b_1^3 - a_1^3)/(b_1 - a_1) > 0 whenever -1 ≤ a_1 < b_1 ≤ 1. In the following theorem 
we give the converse theorem in both the weak and strong forms. 


3. A CONVERSE THEOREM 


Theorem 1. Let f(x) be a function continuous on [a, b] and differentiable on (a, b) 
and let c be a given point in (a, b). Then 


(1) Weak Form: If f'(c) is not a total extremum value on (a, b), i.e., f'(c) ≠ sup 
{f'(x) | x ∈ (a, b)} and f'(c) ≠ inf {f'(x) | x ∈ (a, b)}, then there is some subinterval 
(a_1, b_1) ⊂ (a, b) such that f'(c) = (f(b_1) - f(a_1))/(b_1 - a_1). 

(2) Strong Form: If f'(c) is not a local extremum value of f'(x) on (a, b) and if 
c is not an accumulation point of the set A_c = {x ∈ (a, b) | f'(x) = f'(c)}, then 
there is a subinterval (a_1, b_1) ⊂ (a, b) such that a_1 < c < b_1 and f'(c) = (f(b_1) - 
f(a_1))/(b_1 - a_1). 


Proof: 

(1) If f'(c) is not a total extremum of f'(x) on (a, b) then there are c_1, c_2 ∈ (a, b) 
such that f'(c_1) < f'(c) < f'(c_2). Since f'(c_i) = lim_{x→c_i} (f(x) - f(c_i))/(x - c_i) 
for i = 1, 2, there are subintervals (x_1, y_1), (x_2, y_2) ⊂ (a, b) such that (f(x_1) - 
f(y_1))/(x_1 - y_1) < f'(c) < (f(x_2) - f(y_2))/(x_2 - y_2). Without loss of generality, 
suppose x_1 < x_2. Consider the value K = (f(x_1) - f(y_2))/(x_1 - y_2). If K = f'(c), 
we are done. We need consider only the cases K > f'(c) and K < f'(c). 

K > f'(c). Since g(y) = (f(x_1) - f(y))/(x_1 - y) is continuous on (x_1, b) and 
g(y_1) < f'(c) while g(y_2) > f'(c), there is a point y between y_1 and y_2 such 
that g(y) = f'(c) or (f(x_1) - f(y))/(x_1 - y) = f'(c). 

K < f'(c). Since h(x) = (f(x) - f(y_2))/(x - y_2) is continuous on (a, y_2) and 
h(x_2) > f'(c) while h(x_1) < f'(c), there is a point x between x_1 and x_2 
such that h(x) = f'(c) or (f(x) - f(y_2))/(x - y_2) = f'(c). 
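
The intermediate value step in the weak form can be imitated numerically. The sketch below is only an illustration under our own assumptions: the function names, the choice f = sin with c = 1, the fixed endpoint x_1 = 0.5, and the bracketing interval [0.6, 3.0] are all ours, and bisection stands in for the existence argument.

```python
import math


def secant_slope(f, x, y):
    """Slope of the secant line through (x, f(x)) and (y, f(y))."""
    return (f(x) - f(y)) / (x - y)


def find_parallel_secant(f, slope, x1, y_lo, y_hi, tol=1e-12):
    """Bisect on y so that the secant through x1 and y has the given slope.

    Assumes g(y) = secant_slope(f, x1, y) is continuous on [y_lo, y_hi] and
    that g(y_lo) - slope and g(y_hi) - slope have opposite signs.
    """
    def g(y):
        return secant_slope(f, x1, y) - slope

    lo, hi = y_lo, y_hi
    if g(lo) * g(hi) > 0:
        raise ValueError("the target slope is not bracketed")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


c = 1.0
target = math.cos(c)          # f'(c) for f = sin; not an extremum of cos on (0, 3)
y = find_parallel_secant(math.sin, target, x1=0.5, y_lo=0.6, y_hi=3.0)
```

Here the secant through x_1 = 0.5 and the computed y has slope f'(1) up to rounding, which mirrors the case K > f'(c) with one endpoint held fixed.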


(2) If f'(c) is not a local extremum of f'(x) on (a, b) then there is a subinterval 
(a_0^*, b_0^*) ⊂ (a, b) such that c ∈ (a_0^*, b_0^*) and f'(c) is not a total extreme value of 
f'(x) on (a_0^*, b_0^*). Let a_1^* = (a_0^* + c)/2, b_1^* = (b_0^* + c)/2, ..., a_{i+1}^* = (a_i^* + c)/2, 
b_{i+1}^* = (b_i^* + c)/2, .... Then a_0^* < a_1^* < ··· < a_i^* < ··· < c < ··· < b_i^* < ··· < b_1^* < 
b_0^* and lim_{i→∞} a_i^* = lim_{i→∞} b_i^* = c. 

By part (1) of this theorem, for each subinterval (a_i^*, b_i^*) ⊂ (a_0^*, b_0^*) (i ≥ 0) 
there are a_i, b_i such that (a_i, b_i) ⊂ (a_i^*, b_i^*) and f'(c) = (f(b_i) - f(a_i))/(b_i - a_i). 
If c ∈ (a_i, b_i) for some i, the theorem is proved. Hence we suppose c ∉ (a_i, b_i). By 
the mean value theorem for f(x) on [a_i, b_i], there is some c_i ∈ (a_i, b_i) such that 
f'(c_i) = (f(b_i) - f(a_i))/(b_i - a_i). Hence f'(c) = f'(c_i). Notice that since c_i ∈ 
(a_i, b_i) ⊂ (a_i^*, b_i^*), lim_{i→∞} (b_i^* - a_i^*) = 0, and c_i ≠ c, these c_i cannot coincide 
infinitely often. This implies there is an infinite discrete sequence c_n such that 
lim_{n→∞} c_n = c. This is a contradiction since it implies that c is an accumulation 
point of the set A_c = {x ∈ (a, b) | f'(x) = f'(c)}. ∎ 

We have considered in Theorem 1 the case in which f'(c) is not an extremum of 
f'(x). The next theorem says that if f'(c) is a local extremum of f'(x) on (a, b) 
(i.e., a total extremum on some subinterval of (a, b) containing c) then the strong 
form of the converse either fails on that subinterval or f is linear in a neighborhood 
of c and therefore satisfies the strong form. Of course when f is linear, c is 
an accumulation point of A_c = {x ∈ (a, b) | f'(x) = f'(c)}. When f is not linear in 
a neighborhood of an accumulation point c and f'(c) is a local extremum of f'(x), 
then f must fail the strong form. 

If f'(c) is a local extremum of f'(x) on (a, b), an easy example shows that the 
weak form may still hold. If we define f(x) = x^3 on [-1, 0] and f(x) = 0 on (0, 1], 
then f'(0) = 0 is an extremum of f'(x), x = 0 is an accumulation point, and 

(f(β) - f(α))/(β - α) = 0 for any α, β ∈ (0, 1) with α ≠ β. 


Theorem 2. Let f(x) be a function continuous on [a, b] and differentiable on (a, b), 
and let c be a given point in (a, b). Then if f'(c) is a local extremum value of f'(x) on 
(a, b) (i.e., f'(c) is a total extremum value of f'(x) on some subinterval (a*, b*) ⊂ 
(a, b) containing c), then f is either locally linear about c or f'(c) ≠ (f(β) - 
f(α))/(β - α) whenever α < c < β and (α, β) ⊂ (a*, b*). 


Proof: It suffices to consider functions f such that f(c) = 0 and f'(c) = 0 since we 
can replace f(x) by f(x) - (f(c) + f'(c)(x - c)). If f is linear in a neighborhood 
about c, we are done. Suppose there is a subinterval (α, β) ⊂ (a*, b*) such that 
α < c < β and 0 = f'(c) = (f(β) - f(α))/(β - α). If f(β) = f(α) > 0, by the 
mean value theorem applied to [α, c] and [c, β], there are c_1 ∈ (α, c) and 
c_2 ∈ (c, β) such that f'(c_1) = (f(c) - f(α))/(c - α) = -f(α)/(c - α) < 0 and 
f'(c_2) = (f(β) - f(c))/(β - c) = f(β)/(β - c) > 0. This contradicts the fact 
that f'(c) = 0 is a local extremum of f'(x). When f(β) = f(α) < 0 we get a 
similar contradiction. If f(β) = f(α) = 0, then f(x) = 0 on [α, β] or, since f is 
continuous, there is a point p ∈ (α, β) satisfying f(p) ≠ 0. Applying the mean 
value theorem to both [α, p] and [p, β] yields values for f'(c_1) and f'(c_2) of 
opposite sign, which contradicts the condition that f'(c) is a local extremum of 
f'(x). ∎ 


4. EXAMPLES FOR WHICH THE CONVERSE FAILS. Theorem 1 shows that 
the converse to the mean value theorem may fail at extremum values of f'(x) and 
at certain accumulation points. In this section we give an example of a function f 
that has a countable number of points whose tangent lines to the curve y = f(x) 
do not have any corresponding parallel secant lines. The function f' has extremum 
values at x = (2^k - 1)/2^k for k = 0, 1, 2, ... and x = 1 and has an accumulation 
point at x = 1. We also give an example of a function g that has a derivative with 
an accumulation point at x = 0 but no secant line through points on either side of 
x = 0 whose slope matches that of the tangent line at x = 0. The function f fails 
both the strong and weak forms of the converse, while g fails the strong form but 
not the weak form. 

Define f as follows: 


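\[
f(x) = \begin{cases}
x^3, & x \in [0, 1/4] \\
2(1/4)^3 + (x - 1/2)^3, & x \in (1/4, 5/8] \\
2(1/4)^3 + 2(1/8)^3 + (x - 3/4)^3, & x \in [5/8, 13/16] \\
\quad\vdots \\
\sum_{j=1}^{k} 2\left(\frac{1}{2^{j+1}}\right)^3 + \left(x - \frac{2^k - 1}{2^k}\right)^3, & x \in \left[\frac{2^{k+1} - 3}{2^{k+1}}, \frac{2^{k+2} - 3}{2^{k+2}}\right] \\
\quad\vdots \\
\sum_{j=1}^{\infty} 2\left(\frac{1}{2^{j+1}}\right)^3 + (x - 1)^3, & x \in [1, 2].
\end{cases}
\]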

The function f is continuous on [0, 2] and differentiable on (0, 2). It is a monotonically 
increasing function satisfying f'((2^k - 1)/2^k) = 0 for k = 0, 1, 2, ... and 
f'(1) = 0 but (f(x) - f(y))/(x - y) > 0 for all x, y ∈ [0, 2] with x ≠ y. Hence, 
every tangent line has zero slope at x = (2^k - 1)/2^k for k = 0, 1, 2, ... and x = 1 
but every possible secant line has a positive slope. 

The function 

\[
g(x) = \begin{cases}
x^3 \sin\frac{1}{x} + \frac{x^2}{2}, & x > 0 \\
0, & x = 0 \\
x^3 \sin\frac{1}{x} - \frac{x^2}{2}, & x < 0
\end{cases}
\]

satisfies g(x) > 0 for 0 < x < 1/2 and g(x) < 0 for -1/2 < x < 0 and has the 
derivative 

\[
g'(x) = \begin{cases}
3x^2 \sin\frac{1}{x} - x\cos\frac{1}{x} + x, & x > 0 \\
0, & x = 0 \\
3x^2 \sin\frac{1}{x} - x\cos\frac{1}{x} - x, & x < 0.
\end{cases}
\]

The value g'(0) = 0 is not an extremum of g'(x) since g'(1/((2n + 1)π)) = 
2/((2n + 1)π) > 0 for n > 0 and g'(-1/((2n + 1 - 1/n^2)π)) = -3/(4πn^4) + 
o(1/n^4) < 0 as n → ∞. However, since g'(-1/((2n + 1)π)) = 0 and g'(1/(2nπ)) 
= 0, the value x = 0 is an accumulation point of the set A_0 = {x ∈ 
(-1/2, 1/2) | g'(x) = g'(0)}. There is no interval (a, b) ⊂ (-1/2, 1/2) with 0 ∈ 
(a, b) such that (g(b) - g(a))/(b - a) = g'(0) = 0 since g(a) < 0 < g(b) while 
a < 0 < b. Hence there is no secant line through points on either side of x = 0 
with the same slope as the tangent line at x = 0. Note that g'(1/((2n - 1/n^2)π)) 
= -3/(4πn^4) + o(1/n^4) < 0 as n → ∞, so g'(1/(2nπ)) = 0 is not an extremum 
on (1/((2n + 1)π), 1/((2n - 1)π)) for every n ≥ n_0 for some n_0. Therefore, by 
Theorem 1, there is some subinterval (a_n, b_n) ⊂ (1/((2n + 1)π), 1/((2n - 1)π)) 
such that g'(1/(2nπ)) = 0 = (g(b_n) - g(a_n))/(b_n - a_n). So, in any neighborhood 
of x = 0, the function g satisfies the weak form. 
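
The claims about g and g' can be spot-checked in floating point. This is a rough numerical sanity check under our own assumptions (the function names `g` and `g_prime`, the sample value n = 100, and the tolerances are ours), not a substitute for the asymptotic argument.

```python
import math


def g(x):
    """x^3 sin(1/x) + x^2/2 for x > 0, x^3 sin(1/x) - x^2/2 for x < 0, g(0) = 0."""
    if x == 0:
        return 0.0
    return x**3 * math.sin(1 / x) + math.copysign(x * x / 2, x)


def g_prime(x):
    """Derivative of g; the last term is +x for x > 0 and -x for x < 0, i.e. |x|."""
    if x == 0:
        return 0.0
    return 3 * x * x * math.sin(1 / x) - x * math.cos(1 / x) + abs(x)


n = 100
zero_right = g_prime(1 / (2 * n * math.pi))              # a zero of g' to the right of 0
zero_left = g_prime(-1 / ((2 * n + 1) * math.pi))        # a zero of g' to the left of 0
positive = g_prime(1 / ((2 * n + 1) * math.pi))          # equals 2 / ((2n + 1) pi)
negative = g_prime(1 / ((2 * n - 1 / n**2) * math.pi))   # about -3 / (4 pi n^4)
ratio = negative / (-3 / (4 * math.pi * n**4))
```

For n = 100 the ratio of the computed negative value to -3/(4πn^4) should be close to 1, with the gap accounted for by the o(1/n^4) term.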


5. COMMENTS. There are some open questions that are worth considering. We 
gave an example of a function with a countably infinite number of points at which 
the converse of the mean value theorem fails. Could these points be dense in the 
interval? Is it possible to have an uncountable number of points? Our conjecture is 
“yes” to the first question and “no” to the second question. However, since the 
converse may fail at accumulation points, we are left with doubts. 

[...] from Robert Kanigel's book on Ramanujan? I 
think it reinforced my belief in the universality of the language of 
science. Here were two totally dissimilar people, both culturally and 
temperamentally, from two totally different backgrounds. Each 
knew very little about aspects of the other's personal life, and when 
Hardy learnt later of Ramanujan's personal problems they came as 
a complete surprise to him. Of course, Hardy himself was too 
reserved to let Ramanujan get even a whiff of his own concerns. 
But when they met, which they did almost every day for nearly 
three years, they were on exactly the same wavelength, they spoke 
exactly the same language, they were totally intimate with each 
other in the language of mathematics. 


From a review of R. Kanigel, The Man Who Knew Infinity: 
A Life of the Genius Ramanujan, Washington Square Press, 1992. 
Reviewed by R. Tandon in Resonance, December, 1996. 
Contributed by Richard Askey, University of Wisconsin. 

Primes at a (Somewhat Lengthy) Glance 


Takashi Agoh, Paul Erdős, and Andrew Granville 


One can tell that 

    17 = 2^3 + 3^2 
    19 = 2^4 + 3 
    37 = 5^2 + 2^2·3 
    47 = 2·5^2 - 3 
    53 = 3^2·7 - 2·5 
    97 = 3·5·7 - 2^3 

are all prime, at a glance, since we have written each n = A + B where each 
prime ≤ √n divides exactly one of A and B (and thus n is coprime with every 
prime ≤ √n). This strange procedure is thoroughly investigated in [1]; in general, 
it is quite a challenge to so write a given prime n since the product of the primes 
≤ √n is around e^{√n}. 
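
The "at a glance" criterion is mechanical to check. The following minimal sketch uses helper names of our own choosing (`primes_up_to`, `proves_prime`) and simply tests, for a proposed splitting n = A + B, that every prime up to √n divides exactly one of A and B.

```python
from math import isqrt


def primes_up_to(limit):
    """All primes p <= limit, by trial division (fine for small limits)."""
    return [p for p in range(2, limit + 1)
            if all(p % q for q in range(2, isqrt(p) + 1))]


def proves_prime(n, A, B):
    """True if n = A + B and each prime <= sqrt(n) divides exactly one of A, B."""
    if A + B != n:
        return False
    return all((A % p == 0) != (B % p == 0) for p in primes_up_to(isqrt(n)))


# The six splittings displayed above, written as (n, A, B).
examples = [
    (17, 2**3, 3**2),
    (19, 2**4, 3),
    (37, 5**2, 2**2 * 3),
    (47, 2 * 5**2, -3),
    (53, 3**2 * 7, -2 * 5),
    (97, 3 * 5 * 7, -2**3),
]
results = [proves_prime(n, A, B) for n, A, B in examples]
```

A splitting that fails the test says nothing about n; only a successful splitting certifies primality.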
A similar but more complicated method to establish the primality of n goes as 
follows: let p_1 = 2 < p_2 = 3 < ··· < p_k be the sequence of primes ≤ √n. Write 
n in the form 
    n = N_1 + N_2 + ··· + N_k,    (1) 
where the set of prime divisors of each integer N_i (not necessarily positive) is 
precisely the set of all the primes up to p_k, other than p_i. Then, for each 
j = 1, 2, ..., k, we have (n, p_j) = (N_j, p_j) = 1 (since p_j divides N_i whenever j ≠ i), 
and thus n is prime. This way of determining whether n is prime leads to our title. 
It turns out to be fairly easy to prove that there always is a representation as in (1): 


Theorem. Given p_1 = 2 < p_2 = 3 < ··· < p_k, the first k primes, and a positive 
integer n ≤ (∏_{i=1}^k p_i)(∑_{i=1}^k 1/p_i), free of prime factors ≤ p_k, there exist integers 
N_1, N_2, ..., N_k with 

    n = N_1 + N_2 + ··· + N_k,    (1) 

where each |N_j| < ∏_{i=1}^k p_i and the prime divisors of N_j are precisely 

    {p_1, p_2, ..., p_k} \ {p_j}. 

In fact we shall determine all such solutions to (1) in our proof. 


Proof: Let m = ∏_{i=1}^k p_i, and let m_j = m/p_j for each j. Assume that n is an 
integer in the range 0 < n ≤ ∑_{j=1}^k m_j, which is coprime to m. Define a_j to be the 
least positive integer for which 

    n ≡ m_j a_j (mod p_j) 

for each j (such an a_j exists since (m_j, p_j) = 1). Moreover 0 < a_j < p_j since 
(n, p_j) = 1 (because (n, m) = 1 and p_j divides m). 
Define N = m_1 a_1 + m_2 a_2 + ··· + m_k a_k. Since each a_j ≥ 1, we deduce that 
N ≥ ∑_{j=1}^k m_j ≥ n. Also, since p_j divides m_i whenever j ≠ i, we deduce that 

    N ≡ m_j a_j ≡ n (mod p_j) 

for each j, and so N ≡ n (mod m) by the Chinese Remainder Theorem. Thus we 
may write N = n + λm for some integer λ ≥ 0. We also note that since each 
a_j < p_j, thus λm < N < m_1 p_1 + m_2 p_2 + ··· + m_k p_k = km, so that λ < k. 

Now in any solution to (1), m_j divides N_j for each j, by the hypothesis, so we 
can write N_j = m_j α_j for some integer α_j. Since p_j divides m_i, which divides N_i, 
whenever j ≠ i, we deduce from (1) that 

    m_j α_j ≡ n ≡ N ≡ m_j a_j (mod p_j). 

Therefore α_j ≡ a_j (mod p_j) since (m_j, p_j) = 1. 
The condition |N_j| < m is equivalent to the condition |α_j| < m/m_j = p_j. The 
only integers that are < p_j in absolute value, and ≡ a_j (mod p_j), are a_j itself and 
a_j - p_j. Therefore α_j = a_j - δ_j p_j where δ_j = 0 or 1. Conversely if α_j = a_j - δ_j p_j 
where δ_j = 0 or 1, then |α_j| < p_j so that |N_j| < m. Moreover, all of the prime 
divisors of α_j are then < p_j and thus the prime divisors of N_j = m_j α_j are a subset 
of {p_1, p_2, ..., p_k} \ {p_j}, as required by the hypothesis. Also, since m_j α_j = 
m_j a_j - δ_j m we have 

    N_1 + N_2 + ··· + N_k = m_1 α_1 + m_2 α_2 + ··· + m_k α_k 
        = (m_1 a_1 - δ_1 m) + (m_2 a_2 - δ_2 m) + ··· + (m_k a_k - δ_k m) 
        = N - (δ_1 + δ_2 + ··· + δ_k) m. 

Therefore (1) holds if and only if δ_1 + δ_2 + ··· + δ_k = λ, where each δ_j = 0 or 1. 
Since 0 ≤ λ < k, it is evident that there are solutions to this, and that they are 
given when exactly λ of the δ_j equal 1, and the rest of the δ_j equal 0. ∎ 


Example. To clarify the notation in the proof above we show how to find all 
solutions to (1) for the example with n = 101 and k = 4: 
We have 

    p_1 = 2, p_2 = 3, p_3 = 5, and p_4 = 7, 

so that m = 2·3·5·7 = 210 and 

    m_1 = 105, m_2 = 70, m_3 = 42, and m_4 = 30. 

From some simple modular arithmetic we determine that 

    a_1 = 1, a_2 = 2, a_3 = 3, and a_4 = 5, 

which leads to N = 105·1 + 70·2 + 42·3 + 30·5 = 521. Therefore λ = 
(N - n)/m = (521 - 101)/210 = 2; and thus if α_j = a_j - δ_j p_j for j = 1, 2, 3, 4, 
then exactly two of the δ_j = 1, the other two of the δ_j = 0. This leads to C(4, 2) = 6 
representations of 101 as in (1), namely: 

    101 = -105 + 140 + 126 - 60 = 105 - 70 + 126 - 60 = -105 - 70 + 126 + 150 
        = -105 + 140 - 84 + 150 = 105 + 140 - 84 - 60 = 105 - 70 - 84 + 150 
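
The proof is constructive, so the worked example can be reproduced mechanically. The sketch below follows the notation of the proof (m, m_j, a_j, N, λ, δ_j); the function name `representations` and its return format are our own, and it assumes, as in the Theorem, that n is coprime to every listed prime and does not exceed the sum of the m_j.

```python
from itertools import combinations
from math import prod


def representations(n, primes):
    """All solutions N_1 + ... + N_k = n described in the proof of the Theorem.

    Returns (a, N, lam, reps), where reps lists every tuple [N_1, ..., N_k]
    with N_j = m_j * (a_j - delta_j * p_j) and exactly lam of the delta_j = 1.
    """
    m = prod(primes)
    ms = [m // p for p in primes]
    # a_j: least positive solution of m_j * a_j = n (mod p_j).
    a = [next(x for x in range(1, p) if (mj * x - n) % p == 0)
         for mj, p in zip(ms, primes)]
    N = sum(mj * aj for mj, aj in zip(ms, a))
    lam, remainder = divmod(N - n, m)
    if remainder != 0 or not 0 <= lam < len(primes):
        raise ValueError("n violates the hypotheses of the Theorem")
    reps = []
    for chosen in combinations(range(len(primes)), lam):
        reps.append([mj * (aj - (p if j in chosen else 0))
                     for j, (mj, aj, p) in enumerate(zip(ms, a, primes))])
    return a, N, lam, reps


a, N, lam, reps = representations(101, [2, 3, 5, 7])
```

For n = 101 this recovers a = [1, 2, 3, 5], N = 521, λ = 2, and the six sums listed in the Example.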


Corollary. Every prime n ≥ 11 may be ‘proved’ to be prime by expressing it in the 
form (1), where p_1 = 2 < p_2 = 3 < ··· < p_k are precisely the primes up to √n, and 
N_i is the product of all of those primes other than p_i. 


Proof: For each prime 11 ≤ n ≤ 47 we verify the result by computing an appropriate 
expression of the form (1): 

    11 = 3 + 2^3;  13 = 3^2 + 2^2;  17 = 3^2 + 2^3;  19 = 3 + 2^4;  23 = 3^3 - 2^2; 

    29 = 3^2·5 - 2·5 - 2·3;  31 = 3·5 + 2·5 + 2·3;  37 = 3·5 + 2·5 + 2^2·3; 

    41 = 3·5 + 2^2·5 + 2·3;  43 = 3·5 + 2·5 + 2·3^2;  47 = 3·5 + 2^2·5 + 2^2·3. 

That such expressions exist for each prime n ≥ 53 may be deduced directly 
from our Theorem, by using the following Lemma to ensure that the hypothesis of 
the Theorem holds when we take p_1 = 2 < p_2 = 3 < ··· < p_k to be the primes up 
to √n. 


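Lemma. If n ≥ 49 then (∏_{p ≤ √n} p)(∑_{p ≤ √n} 1/p) > n. 


Proof: Bertrand's postulate asserts that there is a prime in the interval (x, 2x] 
whenever x ≥ 1. In particular, there are primes q ∈ (√n/2, √n] and 
r ∈ (√n/4, √n/2]. 
Therefore, if n ≥ 400, then 2, 3, 5, q, r are distinct primes ≤ √n, so that 

\[
\Bigl(\prod_{p \le \sqrt{n}} p\Bigr)\Bigl(\sum_{p \le \sqrt{n}} \frac{1}{p}\Bigr)
\ge 2 \cdot 3 \cdot 5 \cdot q \cdot r \cdot \Bigl(\frac{1}{2} + \frac{1}{3} + \frac{1}{5}\Bigr)
= 31qr > 31 \cdot \frac{\sqrt{n}}{2} \cdot \frac{\sqrt{n}}{4} > n.
\]

If 121 ≤ n < 400, then 2, 3, 5, 7, 11 are distinct primes ≤ √n, so that 

\[
\Bigl(\prod_{p \le \sqrt{n}} p\Bigr)\Bigl(\sum_{p \le \sqrt{n}} \frac{1}{p}\Bigr)
\ge 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot \Bigl(\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11}\Bigr)
= 2927 > n.
\]

If 49 ≤ n ≤ 120, then 2, 3, 5, 7 are distinct primes ≤ √n, so that 

\[
\Bigl(\prod_{p \le \sqrt{n}} p\Bigr)\Bigl(\sum_{p \le \sqrt{n}} \frac{1}{p}\Bigr)
\ge 2 \cdot 3 \cdot 5 \cdot 7 \cdot \Bigl(\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7}\Bigr)
= 247 > n.
\]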
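
The inequality in the Lemma can also be confirmed by direct computation for small n. The sketch below is a finite check only, with a function name (`lemma_holds`) and a test range of our own choosing; it uses the fact that the product times the sum of reciprocals equals the sum of m/p over the primes p ≤ √n, so only integer arithmetic is needed.

```python
from math import isqrt


def lemma_holds(n):
    """Check (prod of primes p <= sqrt(n)) * (sum of 1/p) > n with integers only."""
    ps = [p for p in range(2, isqrt(n) + 1)
          if all(p % q for q in range(2, isqrt(p) + 1))]
    m = 1
    for p in ps:
        m *= p
    # m * sum(1/p) is the integer sum of m // p over the primes p.
    return sum(m // p for p in ps) > n


checked = all(lemma_holds(n) for n in range(49, 5000))
fails_at_48 = not lemma_holds(48)
```

The bound 49 cannot be lowered to 48: for n = 48 the primes up to √n are 2, 3, 5 and the left side is only 31.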


In many ways, this proof of primality seems to be entirely without merit: one 
needs to know all of the primes ≤ √n for it to be useful, and moreover the 
expression in (1) is, in practice, ridiculously long. However, it does express a proof 
of primality in a single, albeit unwieldy, expression. 


DEDICATORY. Paul Erdős passed away on 20th September, 1996, just a few weeks after this paper 
was accepted for publication. 

The Mathematics Education Reform: 
Why You Should Be Concerned and 
What You Can Do 


H. Wu 


To Chih-Han Sah, in memoriam 


1. INTRODUCTION. In 1962, when the New Math was still on its ascent, 75 
leading mathematicians published an open letter in this MONTHLY to chide the 
New Math for its excesses [11]. It decried empty abstraction and rigid formalism, 
and made a strong case for learning the fundamentals of traditional mathematics: 
“elementary algebra, plane and solid geometry, trigonometry, analytic geometry 
and the calculus.” It also affirmed the importance for students to be able 


to use mathematical language with some fluency, ...to find proofs and, what 
may be the most important activity, to recognize a mathematical concept in, 
or to extract it from, a given concrete situation. 


Unfortunately, this forceful statement is now all but forgotten in the mathematical 
research community. 

Thirty-five years later, we are faced with another mathematics education reform. 
In the usual way in which this term is understood, it refers to both the K-12 
mathematics education reform led by the National Council of Teachers of Mathe- 
matics (NCTM) and the calculus reform. This reform once again raises questions 
about the values of a mathematics education, this time not by imposing empty 
abstractions and rigid formalisms, but by redefining what constitutes mathematics 
and by advocating pedagogical practices based on opinions rather than research 
data of large-scale studies from cognitive psychology. 

The reform has the potential to change completely the undergraduate mathe- 
matics curriculum and to throttle the normal process of producing a competent 
corps of scientists, engineers, and mathematicians. In some institutions, this 
potential is already a reality. 

The purpose of this article is to discuss briefly some of the salient features of 
the reform, explain why the stakes are so high this time around, and finally point 
out some possible avenues for individual and collective action by mathematicians. 
Real progress in changing the direction of the reform will come only when 
reasoned arguments are heard from the whole mathematical community. 


2. SOME SPECIAL FEATURES OF THE REFORM. The reform has its merits. 
For example, it has replaced some of the rote-learning in the traditional curricu- 
lum by supplying motivation and heuristic arguments. It has made students aware 
of the normal process of doing mathematics, such as making conjectures and 
looking for counter-examples. It has also made mathematics more relevant to the 
average student by promoting the use of realistic applications in the curriculum. 
This section is devoted to a few areas of concern in the reform in order to furnish 
a basis for discussion in the subsequent sections; see [24] and [25] for a more 
detailed discussion as well as a more complete list of references. 

The main focus of this article is not on the reform as an abstract idea—who 
does not want to improve education?—but rather on its concrete manifestations in 
the classroom or its explicit statements concerning instructional issues. Education 
is not a theoretical construct, so the reform must be judged by its performance and 
not by its rhetoric. 

The first area of concern is the cavalier manner in which the reform texts treat 
logical arguments; this may well be the most conspicuous deviation from previous 
educational practices. Whenever a justification for a mathematical statement is 
given in reform texts, it is not made clear as a rule whether it is a heuristic 
argument that is very far from a proof or even fallacious, or actually a valid proof. 
Bald assertions without justifications are also made without comment. At the same 
time, the reform prides itself on its vigorous promotion of “higher order thinking 
skills” and “mathematical reasoning” (e.g., [13, pp. 5-6], and [10, pp. 20-21]). This 
uneasy alliance between two inherently contradictory positions breeds many awkward 
situations. For example, a recent article [20] published in the official journal 
of the National Council of Teachers of Mathematics advocates teaching trigonometric 
identities such as sin 2x = 2 sin x cos x solely by graphing each side on the 
screen of a graphing calculator and observing that they coincide. There is no 
mention of a proof. Now if the authors had said that “in addition to proving the 
identity sin 2x = 2 sin x cos x, using the graphing capability of a calculator can 
reinforce students’ confidence in the abstract argument,” we could have applauded 
them for making skillful use of technology in the service of mathematics. 

Let us consider another example. On p. 208 of [7], the derivation of the 
quotient rule (f/g)' = (f'g - fg')/g^2 is given as follows: Let Q = f/g, then 
f = Qg. Differentiate both sides, employing the product rule for the right side, and 
solve for Q’ to get the requisite formula. In this case, the formula is obtained by 
making use of the differentiability of Q, which is in fact part of what must be 
proved in the first place. Nevertheless, the circularity of this argument is allowed 
to stand in the face of prior assurances to the students that this is an “informal but 
mathematically sound justification” [7, p. ix], because the prevailing thinking is that 
this kind of intuitive argument is just right for most beginning math students. But 
what has happened to higher order thinking skills in the meantime? 

One of the main goals of mathematics education enunciated in [11] was “to 
find proofs.” In the current reform, this goal has been challenged, sometimes 
implicitly (e.g., [13, pp. 143-145], [16, p. 61]), and other times explicitly (e.g., [12]). 
In [12, p. 562], Mumford spoke for many reformers when he questioned why, in the 
context of calculus reform, we should even make a judicious and minimal presenta- 
tion of proofs. Why “train them in making logical deductions” at all? His main 
argument is that “logical deduction has no place” in the practices of the sciences, 
nor in the lives of the rest of the educated public. While this point of view can be 
discussed on many levels, I shall follow [12] and simply stay within the context of 
calculus reform. Such an argument overlooks the fact that among the students in a 
typical calculus course are future math majors as well as serious users of mathe- 
matics. These two groups need rigorous mathematical training, and would not be 
satisfied with a steady diet of “persuasive heuristics,” graphic displays, and nothing 
else. They comprise considerably more than one percent of the calculus student 
population as suggested in [12, p. 563] (see, e.g., [24, p. 1536]). Unfortunately, most 
reform texts, notwithstanding the fact that they exclude these two groups by 
design, promote themselves as texts for all students. If they could explicitly make 
known this exclusion, then a good deal of the present concern about the reform 
would instantly disappear. 

The argument of [12] also presupposes a severely utilitarian educational philos- 
ophy: if something is not useful, then throw it out of the curriculum. To a certain 
extent, even a liberal arts education makes some concessions to this philosophy. 
Recently, the Berkeley mathematics department voted to adopt as official text for 
“soft” calculus the applied version of [7], which is one of two books vigorously 
promoted in [12]. On the whole, though, most universities still manage to hold the 
line and endeavor to imbue students with the spirit of intellectual inquiry for its 
own sake. Students continue to be exposed, for example, to logical deductions in 
mathematics and poetic expressions in literature. Shakespeare has not yet been 
replaced entirely by Madison Avenue. There is something to be said about this 
time-honored tradition. 

The same utilitarian impulse is responsible for a second major area of concern 
in the current reform, which is the over-emphasis on relevance and “real world 
applications.” The need for applications in school mathematics curriculum is
beyond debate, but are we willing to embrace a curriculum in which “the 
mathematics truly arises out of applications [and] the units are not centered 
around mathematical topics but rather application areas and themes, with the 
mathematical topics occurring as strands throughout the unit” [1]? The Interactive
Mathematics Program (IMP) [8] comes close to realizing this rather extreme 
viewpoint, although all other reform texts succumb to its spell to varying degrees. 

Those who over-emphasize relevance in school mathematics appear to want to 
reclaim the attention of the sizable number of students who are turned off by 
mathematics, and to hone the working skills of prospective high school graduates 
in order to make them more employable in the high-tech industry [5]. Now 
mathematics is a cohesive discipline with a well understood internal structure. A 
mathematics education ought to cultivate students’ intellectual appreciation of this 
structure and cohesion. Reading the NCTM Standards [13]-[15], no one would 
believe that mathematics is getting its proper due in the present reform. To give a 
rather provocative example: a student coming out of a reform curriculum would 
not understand why the recent proof of Fermat’s Last Theorem is a landmark 
event in human culture. 

An application-oriented curriculum can furnish a valid mathematics education 
provided enough attention is given to mathematical closure. Tools developed for 
the purpose of solving a practical problem should be put in the proper mathemati- 
cal context, and abstract ideas distilled from such solutions should preferably be 
applied to completely different situations to demonstrate the fundamental role of 
abstraction in mathematics. Unfortunately, mathematical closure is hardly ever 
applied in the reform. When the NCTM Standards discuss the problem of finding 
the roots of the cubic $5x^3 - 12x^2 - 16x + 8 = 0$ in the context of Grades 9-12
[13, pp. 152-153], the only expectation of the majority of the students is that they
construct an algorithm for approximating the real roots and test it on a graphic
calculator. This is all. No mention to this group of the nature of the roots of 
polynomials (are the real roots rational?), or the existence of real roots (if the 
degree is odd?), or the existence of complex roots in general, etc. 
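
The parenthetical questions have short answers, which a brief worked example can make concrete. The sketch below assumes the cubic reads $5x^3 - 12x^2 - 16x + 8 = 0$; the exponents are a reading of the printed equation and have not been checked against [13]. Under that assumption, the rational root test and one factorization settle the nature of the roots.

```latex
% Assumed cubic: p(x) = 5x^3 - 12x^2 - 16x + 8.
% Rational root test: a rational root a/b in lowest terms needs a | 8 and b | 5.
\[
p\!\left(\tfrac{2}{5}\right)
  = 5\cdot\tfrac{8}{125} - 12\cdot\tfrac{4}{25} - 16\cdot\tfrac{2}{5} + 8
  = \tfrac{8}{25} - \tfrac{48}{25} - \tfrac{160}{25} + \tfrac{200}{25} = 0 .
\]
% Dividing out the factor (5x - 2) leaves a quadratic with irrational roots.
\[
5x^3 - 12x^2 - 16x + 8 = (5x - 2)\,(x^2 - 2x - 4),
\qquad
x = \tfrac{2}{5}, \quad x = 1 \pm \sqrt{5}.
\]
```

On this reading all three roots are real and exactly one is rational. The existence of at least one real root is known in advance, since a real polynomial of odd degree changes sign.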

To put it in a musical context, an overly utilitarian approach to mathematics 
education is akin to impressing Beethoven’s greatness on school students by 
presenting him solely as the composer of the tunes for the Huntley-Brinkley Show, 
the Beatles’ movie Help!, and the recent TV ad for Acura. Even if we succeed, can
we take pride in such a Pyrrhic victory?

A third area of concern in the current reform stems from the fact that
mathematics is a precise technical language. Students must strive to master this 
new language. A tendency of the reform is to move mathematics completely back 
into the arena of everyday life where ambiguity and allusiveness thrive. A loss of 
precision in mathematical presentations is the result. One example is the way IMP 
[8] treats mathematical concepts. Consider the standard notion of the expected 
value of a random variable. In the IMP text, this concept is used throughout the 
second half of a unit called “The Game of Pig” [8, pp. 96-186], but it first appears 
in a homework problem as a term commonly used in everyday language [8, p. 134], 
and is subsequently never defined in the text proper. In the Glossary at the end of 
the whole book, it is stated: “Expected value: In a game or other probability 
situation, the average amount gained or lost per turn in the long run.” [8, p. 257]. 
Without arguing whether such a definition is usable from a student’s point of view 
(or whether it is even correct), I simply point out that in the Teacher’s Guide for 
“The Game of Pig,” teachers are alerted to the introduction of this new terminol- 
ogy, and that they are instructed to tell the students that “the concept of expected 
value is nothing new,... [but] the use of such complex terminology makes it easier 
to state complex ideas.” Whatever became of the goal to teach students to “extract
a mathematical concept from a given situation”? Can this goal be accomplished if, 
instead of carefully guiding the students to perform the “extraction,” the text 
systematically embeds the mathematics in the vagueness and uncertainty of every- 
day life? 
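
For comparison with the glossary entry quoted above, the standard definition for a random variable with finitely many values can be stated in one line. The symbols $X$, $x_i$, and $p_i$ are notation assumed here for illustration; they are not taken from the IMP text.

```latex
% X takes the values x_1, ..., x_n with probabilities p_1, ..., p_n
% (each p_i >= 0 and p_1 + ... + p_n = 1).
\[
E[X] = \sum_{i=1}^{n} x_i\, p_i .
\]
% Example: one roll of a fair six-sided die.
\[
E[X] = \tfrac{1}{6}\,(1 + 2 + 3 + 4 + 5 + 6) = \tfrac{21}{6} = 3.5 .
\]
```

The “average in the long run” of the glossary is then a consequence of this definition, by the law of large numbers, rather than the definition itself.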

Suppression of precision also takes the form of intentionally slighting basic 
algorithms and formulas. An example of the former is the various methods 
employed to avoid teaching even the basic multiplication and division algorithms in 
K-4. As an example of the latter, the pre-calculus text [19] spends two pages
(pp. 209-210) discussing the relationship between the measurements of an angle in 
degrees and radians, but assigns the discovery of the general formula relating the 
two to two exercises. Given this trend, we will soon see calculus texts which
compute only the derivatives of $x^2$, $x^3$, and $x^4$, but relegate the formula for $x^n$ to an
exercise; better yet, they will evaluate one definite integral of a power of $x$ and one of $\sin x$ but leave the
statement and proof of the Fundamental Theorem of Calculus to an exercise.
Where will this end? 

The preceding concerns all have to do with curriculum, but there are others of a 
different nature. The foremost is the relative neglect in the K-12 reform of the 
issue of teacher qualification. A main cause of the dysfunctional mathematics 
classroom of the seventies and eighties, which eventually led to the call for reform 
in [18], is teachers’ inadequate knowledge of mathematics. In light of this, the present
emphasis of NCTM on curriculum, pedagogy, and assessment methods in [13]-[15], 
with no commitment to a rigorous program of re-training of the teachers in the 
field and a strengthening of the future teachers’ mathematical education, practi- 
cally guarantees the continued mediocrity (if not failure) of mathematics education 
in our nation [23].

Another concern is with the new pedagogy, which relies heavily on constructivis- 
tic instructional strategies, such as cooperative learning and the discovery method. 
As a theory of learning, constructivism holds that the acquisition of knowledge 
takes place only when the external input has been internalized and integrated into 
one’s own mind. However, the current reform transforms constructivism into a
theory of instruction ([10], [13]-[15]). In order to help along this mental construction,
class time in reform classrooms is reserved primarily for students to re-discover
or re-invent concepts and methods of solution. Furthermore, this process of
re-discovery is facilitated by the use of cooperative learning. Having students work
together in small cooperative groups is a preeminent characteristic of the present 
reform effort in school mathematics. In such a learning environment, the teacher 
ceases to be “the sage on the stage” and instead serves as “a guide on the side.” 

While a little bit of group learning and guiding-on-the-side is good in the 
classroom, too much of this is happening in the reform classrooms to the detriment 
of good education. When cooperative learning rules, teachers cannot share their 
insights with students or warn them against pitfalls. Moreover, students cannot 
learn enthusiastically from their teacher in class and do the mental construction at 
home. Just how much substantive mathematics can be learned this way? 

One would be quite mistaken to regard my critical comments on the pedagogi- 
cal and instructional recommendations of the reform as nothing more than a 
mathematician’s objection to facts well-established in cognitive psychology. It is a
sobering experience to read the articles by Grossen [6] and Anderson-Reder-Simon 
[2], which provide critical assessments of these recommendations by an educator 
and three cognitive psychologists, respectively. In particular, the former points out 
the complexity in any successful application of cooperative learning and the lack of 
large-scale studies to support its unrestricted applications, while the latter takes to 
task many of the instructional prescriptions derived from constructivism. 


3. WHY IT MATTERS. The most obvious reason why school mathematics educa- 
tion should matter to university professors is that a continuing influx of mathemati- 
cally incompetent students would decimate the university mathematics curriculum. 
One need look no further than the United Kingdom to have one’s worst fears
confirmed. If a report released by the Council of the London Mathematical Society 
in October, 1995, is to be trusted, then the UK is some five years ahead of us in a 
mathematics education reform remarkably similar to our own in its rhetoric. If our 
reform takes hold, then according to [22], we can look forward to a generation of 
students with: 


(i) a serious lack of essential technical facility—the ability to undertake 
numerical and algebraic calculation with fluency and accuracy; 
(ii) a marked decline in analytic powers when faced with simple problems 
requiring more than one step; 
(iii) a changed perception of what mathematics is, in particular of the essential
place within it of precision and proof.


But the worst is yet to come. For example, to the charge that the Harvard 
Calculus [7] passes students through calculus without requiring any algebraic skill, 
one reply was that students’ symbolic manipulative skills are much weaker than 
they used to be, and so some symbolic manipulation should be eliminated from 
calculus. In the same vein, in response to the charge that students pass through 
reform calculus with at best a rudimentary knowledge of algebra, the comment 
from reformers was that we did this long before calculus reform. 

Instead of trying to uphold a certain standard and help mold as-yet-unformed 
minds, educators simply accept deterioration in the classroom as a given. It would 
be only a small step to apply such a philosophy in earnest to demand a total 
revamping of undergraduate, and even graduate, mathematics programs in order 
to fit the deficiencies of the new generation. In point of fact, such suggestions have 
already been made. For example, [9] recommends that we “Change the first two 
years of collegiate mathematics to match the new K-12 curriculum.” Not coincidentally,
the opening statement of the Precalculus Project of the Calculus Bridge
Consortium Based at Harvard University echoes this sentiment word for word:
“Given the success of the reform calculus movement, students and teachers want
reformed courses both preceding and following calculus” [4]. The mathematics 
department of a major state university started to revise all its upper divisional 
courses in May of 1996 in order to “mesh with the aftermath of the Harvard 
Calculus reform.” 

The logic of the reform is inexorable: once the reform is entrenched in K-12, 
university mathematics courses will have to follow suit. The next step will
inevitably be a demand for reform in the graduate program. Thus in no time at all,
the burning question of the day will be whether or not proofs are allowed only in 
graduate courses. 

We must object to the reform because it threatens to bring down the whole 
education system. Indeed, our students of today will be the teachers of tomorrow, 
so when university courses start to deteriorate our children will be taught by 
teachers who are mathematically worse-equipped than those of today. Then the 
next wave of students will perform even more poorly, and the poor performance 
will incite the educators to demand a second mathematics education reform. And 
the vicious circle will continue. Lest such worries be construed as sheer paranoia, 
let me quote a recent (1996) report from the organizer of a workshop for high 
school mathematics teachers in a Western state (who asked to remain anonymous): 


In the afternoon we started talking about the state of students’ preparation 
for calculus and all of them said it is getting worse year by year... . The 
picture they painted for me was one in which [the teachers] are nearly 
powerless to prevent what they see as a watering down of the curriculum 
because administrators, untrained in mathematics, are making the decisions 
based on reports filled with what they describe as NCTM jargon. One 
teacher... predicts that there will be no calculus course in three years 
because no one will be ready for it. 


The reform also raises a grave concern in a different context. The economic and 
social well-being of our nation is critically dependent on the existence of a robust 
corps of technicians in science and technology: the competent mathematicians, 
scientists, and engineers who evolve from school students gifted in science and 
mathematics. Because the reform favors weaker students, the top students end up 
being shortchanged, and the continuous supply of this technical corps is put in 
jeopardy. This problem is becoming so serious that it has alarmed the U.S. 
Department of Education. In a refreshingly straightforward document [17], it 
offers a criticism of the reform: 


Ultimately, the drive to strengthen the education of students with outstanding
talents is a drive toward excellence for all students. Education reform will
be slowed if it is restricted to boosting standards for students at the bottom 
and middle rungs of the academic ladder. At the same time we raise the 
“floor” (the minimum levels of accomplishment we consider to be accept- 
able), we also must raise the “ceiling” (the highest academic level for which 
we strive).


4. WHAT MATHEMATICIANS CAN DO. The open letter [11] is a remarkable 
document of sound educational principles in mathematics education, but it 
appears to be the only one that demonstrates the collective concern of the
mathematical research community for school mathematics in the past half century.
It has left almost no marks, because tangible results in education can be achieved 
only by sustained effort. The absence of such an effort by the mathematical 
community at large, especially the research community, has allowed the traditional 
K-12 curriculum and the teaching of calculus to deteriorate, thereby opening the 
floodgate to a multitude of educational ideas of dubious merit. The reform is the 
natural product of this indifference. 

Professional mathematicians have an additional obligation to break this indiffer- 
ence and speak out against the defects of the reform, because teachers who wish to 
do so are under pressure to maintain a facade of compliance. As a teacher from 
Pennsylvania put it: “The ‘other side’ is making it very uncomfortable for teachers 
such as me, and we are dropping like flies. Whereas university professors like you 
can disagree with impunity, that same privilege is denied to those of us lower on 
the scale.” 

If we wish to shake off this indifference and enter into a discussion of 
mathematics education, then we have to enlarge our vision concerning the teaching 
of mathematics. We have to temporarily abandon the narrow focus of training 
future mathematicians and embrace the broader and more complicated issue of 
educating students who have diverse goals in life. We must also learn about the 
reality in schools where teachers are habitually overworked and have not the 
luxury of intellectual contemplation. Criticisms of the reform that do not take into 
account deviations from our normal “universe of discourse” are not likely to find a 
receptive audience. 

In discussing the reform, we also have to be aware of the existence of the many 
serious defects in the generic traditional mathematics curriculum in the schools 
[24]. A return to “business as usual” would be no cure. 

One last thing we need to be aware of is that, although our professional 
instincts compel us to insist on rigorously proving everything, there is no faster way 
to lose credibility as educators than to build our whole case against the reform on 
this one theme alone. It is far too easy, for example, to harp on the absence of ε-δ
proofs of the basic theorems of limit and continuity in the reform calculus texts, 
but a pedantic insistence on rigor is by no means the best approach to the teaching 
of elementary mathematics. It would be more realistic to ask that only the truly 
basic facts be proved in beginning courses and that there be careful differentiation 
between what is actually proved and what is not. Gaps can always be filled later, 
provided no circular reasoning is involved and provided the students are made 
aware of the gaps. 

What then can we do, individually and collectively? Here are a few suggestions. 

The situation regarding the calculus reform is relatively simple. Since it is being 
carried out mainly by our peers, we should press for a vigorous debate, not only in 
professional journals but also in every one of our own departments. If personal and 
anecdotal experiences serve, most mathematicians active in research regard the
teaching of calculus as something unworthy of serious attention. The time to 
change this attitude is now before the reform gets out of control. 

The K-12 reform is inherently more complicated and calls for efforts in more 
than one direction. First of all, NCTM is currently revising its Standards [13]-[15]
for a second edition. It has created a Commission on the Future of the Standards 
and has asked several mathematics organizations, such as MAA, AMS, and SIAM, 
to create their own committees to work closely with the Commission over the next 
three years. These committees are to provide sustained advice and information. 
We should seek out members of these committees to give them our opinions on
the reform in general and on the Standards in particular. This is our chance to
infuse the Standards with more mathematical substance and a more balanced 
viewpoint. 

In the meantime, we should offer our critical comments on the reform. In spite 
of pleas for the mathematical community to “speak with one voice” in support of 
the reform, we should keep up the fine critical tradition initiated in [11]. What is 
missing in the reform is the commitment to teach mathematics, in all its guises, 
without violating its integrity. If we mathematicians do not reaffirm this commit- 
ment, then who will? 

Something much less easy to achieve but immensely more important is for 
mathematicians to help improve the training of prospective school teachers. 
Mathematics education on the college level is, more often than not, aimed 
exclusively at producing future mathematicians. The usual college mathematics 
courses drill the students on the technical details of fundamentals in order to 
prepare them for graduate work in mathematics. But for those who leave mathe- 
matics after their college degree, e.g., school teachers, such courses yield brief 
glimpses of the trees but never the panorama of the forest. In the words of Allyn 
Jackson, such an experience in mathematics is akin to “finishing a BA in English 
literature having done a lot of technical analysis of Shakespeare but having no idea 
about Shakespeare’s stature in English literature.” Because fewer than 20% of math
majors go on to do graduate work, we are addressing only 20% of our students while 
pretending to be teaching them all; see [3] and [23]. A narrow focus on producing 
future mathematicians is a significant factor in the inadequate mathematical 
preparation of our school teachers. 

There is no simple remedy for this educational difficulty. Larger institutions can 
schedule different sections of the same course to satisfy the divergent needs of the 
students. Smaller colleges can overcome this obstacle only by the extra dedication 
and ingenuity of instructors. We are all capable of making a contribution to this 
important matter just by being more conscientious in carrying out our normal duties. 

A third area for possible action is direct participation. For example: 


(A) Be an author of school mathematics texts. 

(B) Join a group that engages in curricular activities. 

(C) Act as consultant and critic on education. 

(D) Work directly with one’s own local school board or teachers. 
(E) Speak up as a citizen and do grassroots work. 


Regarding (B), the main difficulty is an almost unbridgeable chasm between 
educators in the K-12 reform and mathematicians, so any contribution we hope to 
make here requires establishing some mutual trust between the two groups. 
Regarding (C), despite exhortations by NSF and AMS for research mathematicians 
to partake of the education enterprise, there is in fact no support for critical 
educational writing. On the other hand, NSF funded the writing of textbooks such
as Earth Algebra [21]. Life is indeed full of mysteries.

Thus far, the most effective method of making one’s voice heard in K-12 edu- 
cation is by way of grassroots efforts. Prime examples of this are the various groups 
organized by parents in California, which played a substantial role in hastening the 
revision of the 1992 California Mathematics Framework [10]. These and other 
action groups serve the vital function of giving voice to alternative points of view 
and galvanizing dissent into action. If we can add our professional voices to the 
efforts of these groups, we can help create a potent force for change within 
education.


Confronting Reform


Jeremy Kilpatrick 


Three times in this century, some members of the American mathematical commu- 
nity have attempted to reform school mathematics only to discover that others 
objected to the direction the reform appeared to be taking. In this paper, I sketch 
the issues at stake in the first two reform efforts and then turn to the third and 
current effort, giving particular attention to the critique offered by H. Wu [33]. The 
paper ends with some thoughts on the challenges of changing school mathematics. 


A UNIFIED CURRICULUM. At the turn of the century, reformers at the Univer- 
sity of Chicago High School and at several other Illinois schools were attempting to 
unify the secondary curriculum, principally by merging the year-long courses in 
algebra and geometry [25]. In 1903, E. H. Moore [15], retiring as president of the 
American Mathematical Society, gave a powerful impetus to the nascent reform 
efforts by devoting part of his presidential address to mathematics in secondary 
education. Moore called for “the unification of pure and applied mathematics” 
and “the correlation of the different subjects,” to be accomplished by organizing 
algebra, geometry, and physics into a “thoroughly coherent four years’ course.” 

Reaction to what was to become known as the “Chicago movement” was swift. 
Conservative mathematicians in the East, most prominently David Eugene Smith, 
although tolerant of the brash Midwesterners tinkering with new approaches, 
argued that the secondary classroom was a place for pure, not applied, mathemat- 
ics. In particular, the mental disciplinary power of geometry, together with its 
aesthetic and cultural value, demanded that it be kept in a separate course. 

By the time the final report of the MAA’s National Committee on Mathematical 
Requirements [13] appeared in 1923, much of the support for a unified curriculum 
had shifted to Grades 7 to 9 and away from Grades 10 to 12. The report’s authors 
wanted all students, many of whom were dropping out of school by the end of 
Grade 9, to have a broad view of mathematics and consequently proposed some 
integrated courses for the junior high school. They also suggested ways in which 
the curriculum of Grades 10 to 12 might be reorganized to connect algebra with 
geometry and to include some work in statistics and even calculus. They acknowl- 
edged, however, that although experimental unified courses were being developed, 
few high schools were adopting them. The movement to unify the mathematics 
curriculum was already fading away under attacks on mathematics as a required 
subject in secondary school and the growth of courses emphasizing the social uses 
of mathematics (primarily arithmetic). Today, the residue of the reform effort can 
be seen in the “general mathematics” course, the impoverished counterpart to 
first-year algebra. 


A MODERN CURRICULUM. The next wave of reform began to build in the 
1950s, as university mathematicians and school mathematics teachers joined forces 
to attempt to bridge what they saw as the widening gap between school and 
collegiate mathematics. Concerned that the “explosive development of mathematics”
[4, p. 1] was not being reflected in the school curriculum, that too few students
were entering college prepared to study advanced mathematics, and that the 
nation risked a serious shortage of mathematically trained personnel, reformers 
mounted a variety of projects to improve the teaching of school mathematics by 
developing new curriculum materials and retraining teachers. These efforts became
known as the “new math”: “a label not so much for a cohesive set of reform
proposals and activities as for an era during which a variety of reforms were
undertaken” [26, p. 413].

Many of the reforms, but not all, were marked by an emphasis on what were 
seen as unifying concepts of mathematics—set, relation, function, and the 
like—coupled with the abstract structures—groups, rings, fields, vector 
spaces—into which they are organized. “Because the university mathematicians 
who dominated the modern mathematics movement tended to be specialists in 
pure rather than applied mathematics, they saw pure mathematics, with an 
emphasis on set theory and axiomatics, not only as the content that was missing 
from the school curriculum but also as providing the framework around which to 
reorganize that curriculum” [26, p. 412]. 

Again, the reaction was swift. Morris Kline was the first and loudest voice, 
arguing that aspects of the reform efforts were “‘wholly misguided,’ ‘sheer nonsense,’
attempts to replace the ‘fruitful and rich essence of mathematics’ with
‘sterile, peripheral, pedantic details’” [quoted in 6, p. 55]. In the position paper “On
the Mathematics Curriculum of the High School” [16], Kline and 64 other mathe- 
maticians offered a more measured critique—essentially arguing that anyone 
attempting reform needed to link school mathematics more closely to its history 
and to concrete applications and not to make it so abstract and formal that future 
nonmathematicians would be turned away. An important feature of the paper was 
that it offered “fundamental principles and practical guidelines.” E. G. Begle [3], 
director of the School Mathematics Study Group, the largest and most prominent 
of the new math curriculum reform projects, expressed delight with the guidelines, 
claiming that most were reflected in the new textbooks, and then gently chided the 
authors for failing to distinguish among the different projects and their suggestions 
for curriculum improvement, thus effectively rejecting them all. 

Once most of the new math projects had ended, Kline fired the last shot. In 
Why Johnny Can’t Add: The Failure of the New Math [12], published in 1973, he 
reiterated and elaborated his opposition to the reform. Although the book was 
marred by a sometimes flippant tone and a persistent unwillingness to make 
distinctions among reform efforts, Kline offered cogent thoughts on deduction, 
rigor, and the language of mathematics. (Despite the book’s title, it dealt with the 
secondary curriculum and not the teaching of arithmetic. Kline once confided that 
his publisher insisted on the title.) He ended the book by arguing that the 
appropriate direction for any reform “should be diametrically opposite to that 
taken by the new mathematics” [12, p. 144], toward mathematics as an integral part 
of a liberal education, with connections to culture, history, science, and other 
subjects. He cited with approval Moore’s [15] call to combine mathematics with 
science in high school and to reduce the artificial separation between its pure and 
applied sides. Thus, Moore, who had pushed the earlier reform effort, was cast in 
opposition to the second. 

The residue of the new math era may be difficult to see in today’s school 
mathematics, but it is there. The precalculus course, for example, is a direct 
descendent of the elementary functions and introductory analysis courses that 
appeared during that time. Some of the new math’s terminology and notation has
disappeared, but much survives. And various topics, such as inequalities, that the
reforms introduced into school mathematics have remained. With respect to 
changes in the way mathematics has been taught, few of the reform proposals 
appear to have been extensively implemented. It is popular to declare that the new
math was tried and that it failed. Studies of school practice at that time [8], [17], 
however, suggested that in most classrooms the reforms were never really tried. 


A STANDARDS-BASED CURRICULUM. Over the past decade or so, the reform 
impulse has heated up once more, this time led by professional organizations 
under the banner of raising expectations and providing mathematical literacy for 
all [19], [20], [21], [22]. The publication A Nation at Risk [18] set many of the terms 
of the discourse: low performance on international assessments threatened the 
nation’s economic competitiveness; declining test scores nationally meant that 
rigorous and measurable standards were needed. Reformers of school mathematics 
have argued that changes in society and within mathematics itself necessitate a 
more demanding school mathematics curriculum. The goal is to develop students’ 
mathematical power: “Truth and beauty, utility and application frame the study of
mathematics like the muses of Greek theater. Together, they define mathematical 
power, the objective of mathematics education” [22, p. 43]. 

Much of the leadership in promoting reform has come from the National 
Council of Teachers of Mathematics (NCTM). This organization, founded in 1920 
to help preserve the place of mathematics in the secondary curriculum, supported 
but did not lead the new math reform efforts. Two decades ago, however, it began 
to play a more active role as a national voice for teachers, in part as a response to 
the widely perceived failure to change school mathematics during the new math 
era and in part to counter the ensuing “back to basics” backlash of the mid-1970s 
[14, pp. 22-25]. In its first, and most influential, reform document [19], the NCTM 
took the term standard from the rhetoric of raised expectations and accountability 
for results and made it a statement for judging the quality of school mathematics 
and for providing “an informed vision of the future” [14, p. 36]. 

The language of “mathematical power” represents an attempt to provide a 
vision of “what it means to be mathematically literate both in a world that relies 
on calculators and computers to carry out mathematical procedures and in a world 
where mathematics is rapidly growing and is extensively being applied in diverse 
fields” [19, p. 1]. The arguments given for reforming the school mathematics 
curriculum, instruction, and assessment rest on the contention that because “all 
industrialized countries have experienced a shift from an industrial to an informa- 
tion society,” the mathematics that students need to know in order to be “self- 
fulfilled, productive citizens in the next century” [19, p. 3] has also changed. The 
changes in society have demanded that schools change as well. Although previous 
reform efforts had their effects, virtually all observers of U.S. school mathematics 
classrooms have come away convinced that change is needed. 

A large part of the standards-based reform is built on the view that mathematics 
itself has become more computational and less formal. “In recent years, a reaction 
against formalism has been growing. In recent mathematical research, there is a 
turn toward the concrete and the applicable. In texts and treatises, there is more 
respect for examples, less strictness in formal exposition” [5, p. 344]. Even before 
the recent controversy over “the death of proof” [1], [7], [10], so-called informal 
geometry courses, minus proof, were being introduced into high schools as state 
legislatures and school districts mandated “geometry for all.” For some high 
school teachers, the call for “decreased attention” to “Euclidean geometry as a 
complete axiomatic system” and “two-column proofs” [19, p. 127] has been 
interpreted as permission to do away with proof altogether, for everyone [14, 
p. 118]. (For an eloquent defense of proof in high school geometry, together with a 
proposed year-long syllabus, see [32].) And clearly, the availability of computer 
software and graphing calculators has made it easier than ever before to visualize 
relationships and test numerous cases of a generalization before or in place of 
providing deductive justifications. 

This time around, the negative reaction to reform proposals and activities has 
been slow to come. The lag may be because endorsements were sought and 
received for the proposals from all parts of the mathematical community. Or it 
may be because the proposals were framed in rather general terms, with textbooks 
and other materials from reform projects appearing only in the last few years. 
Opposition to the reform, however, has been building—much of it on the Web—for 
some time, and now we have an article in print by H. Wu [33], one of the most 
outspoken of the critics. 


WU’S CRITIQUE. Like many before him, Wu is not concerned with making 
distinctions among reform proposals or between the proposals and the various 
activities carried out and materials developed in the name of reform. He conflates 
what he terms the K-12 mathematics education reform promoted by the NCTM [19], 
[20], [21] and the calculus reform stimulated largely by work of the MAA and the 
National Academy of Sciences [27]. Although these two efforts share many com- 
mon features, they have rather different agendas—with one attempting to lay out 
a broad framework for the mathematics all American schoolchildren need to know 
and be able to do and the other targeting a college course seen as unsatisfactory 
and out-of-date. 

Wu’s scattershot approach to the K-12 reform relies heavily for its force on 
unsubstantiated claims and random anecdotes. Contending that “the reform must 
be judged by its performance and not by its rhetoric” [33, p. 947], he offers no 
documentation of performance whatsoever. Instead, he often uses the rhetoric of 
the NCTM standards documents and the content of textbooks purporting to follow 
the standards to support his assertions that performance must be bad. 

He also hits some inappropriate targets. For example, he castigates two high 
school teachers [24] writing in the Mathematics Teacher for their attempt to help 
students see the functions involved in a trigonometric identity before establishing 
its validity. Quite apart from whether every article in an official journal of an 
organization promoting reform must reflect reform views, one can reasonably ask 
whether seeing the graphs of these functions alone might not help students 
understand the identity. And how can Wu be so certain that teachers who are 
having students use graphing calculators are neglecting proof just because it is not 
mentioned in the article? 

A second example is Wu’s [33, p. 949] disapproval of textbooks that neglect 
“basic formulas.” He cites the precalculus textbook produced by teachers at the 
North Carolina School of Science and Mathematics [2], a book whose origins were 
independent of, although ultimately in harmony with, the NCTM reform. The 
North Carolina approach relied much more than anything NCTM has proposed on 
the view that every topic ought to be introduced with an application [11, p. 154]. 
Wu notes that in its discussion of radian measure [2, pp. 209-210], the North 
Carolina textbook fails to give the formula relating degrees and radians but instead 
leaves it to the exercises. The issue here, as any textbook author will recognize, is 
the tension between textbook as archive and textbook as tool for learning. Once a 
formula is put into a text for memorization and subsequent reference, there is little 
point in asking the reader to find it. The omission of a formula from a textbook, 
reform or otherwise, should not be interpreted to imply that teachers will slight it 
in the classroom. Is there no value in having students develop a formula on their 
own? Must it always be given in the textbook? Wu is on much firmer ground 
elsewhere when he discusses problems associated with the use of open-ended 
problems [31]. The examples he gives clearly show that the authors of such 
problems have not always thought through the mathematics needed to solve them 
and the reasons for having students work on them. 

A final example of Wu’s indiscriminate approach is his criticism of the NCTM 
standards [19] for presumably arguing that information about the nature and 
existence of polynomial roots should be withheld from most high school students 
[33, p. 948]. The reference is to a discussion of how the solution of polynomial 
equations might be “differentiated in both depth and the level of formalism” [19, 
p. 152] by treating it at any of five levels. Nowhere does the document say that as a 
rule students should not learn about polynomial roots. What the discussion 
attempts is an illustration of how teachers might begin with exploration rather than 
abstraction, depending on the students’ knowledge and experience. Wu sees this 
approach as advocacy of a utilitarian curriculum that never reaches “mathematical 
closure” (i.e., formal proof). He is right that the arguments for a standards-based 
curriculum are largely utilitarian, but he wrongly attributes the apparent decline in 
attention to proof to the standards movement alone, and he makes the same error 
some teachers do in interpreting a call for decreased attention to certain proof 
practices as sanctioning the complete elimination of formal proof itself. 

The conception of the learner underlying the NCTM standards has usually been 
characterized as constructivism [14, p. 113], a term that has almost lost its meaning 
in American mathematics education. It began as the epistemological position, 
associated with Jean Piaget, that a learner both incorporates novel experience into 
existing mental structures (assimilation) and reorganizes those structures to handle 
more problematic experience (accommodation). Later interpreters stressed the 
accommodation aspect, arguing that learners actively construct knowledge rather 
than receiving it passively from the environment. The most radical view, which has 
become popular in some quarters of American mathematics education but almost 
nowhere else, is that the learner is an informationally closed system that cannot 
know an independent, pre-existing external world [29], [30]. 

As a theory of knowledge acquisition, constructivism says nothing about how 
teaching or instruction should proceed. In recent years, however, practices that 
encourage students to become active learners by conducting investigations, work- 
ing in groups, and handling concrete objects have come to be characterized as 
“constructivist teaching.” Only if educators all the way back to Plato—including 
Comenius, Pestalozzi, Herbart, Froebel, Dewey, and Montessori—were to be 
considered constructivists would such practices uniquely define constructivist 
teaching. Overblown claims have been made that radical constructivism has brought 
“a new revolution in mathematics education of a magnitude no less than the 
modern mathematics movement of the 1960s” [28, p. 720] and has provided the 
epistemology underlying the current reform, but there is no real evidence for these 
assertions. What is clear, however, is that the reform documents advocate some 
pedagogical practices that 50 years ago might have been labeled “progressive,” but 
that today are termed “constructivist.” 

In his critique, Wu condemns “the new pedagogy, which relies heavily on 
constructivistic instructional strategies, such as cooperative learning [a form of 
group work not directly connected to the standards-based reform] and the discov- 
ery method [apparently any instructional approach in which students engage in 
inquiry]” [33, p. 949]. He claims that “too much of this is happening in the reform 
classrooms to the detriment of good education” [33, p. 950]. It is impossible to 
know, in the absence of any data, what constitutes heavy reliance or “too much.” 

Wu cites an article [9] that he says laments the absence of large-scale studies 
supporting “unrestricted” (his term) applications of cooperative learning. Al- 
though the article does not actually make that assertion, a more critical point is 
that research, no matter how large in scale, can never justify the indiscriminate use 
of a teaching method, traditional or otherwise. Methods are always interpreted 
and used in different ways by different teachers for teaching different topics to 
different students. A new method can no more be shown universally superior than 
can traditional methods, whatever they might be. 

The most constructive part of Wu’s critique, by far, is the final section in which 
he urges mathematicians to become more involved in mathematics education, 
contributing ideas to the revision of the NCTM standards documents, helping to 
improve the training of prospective teachers, and participating directly in curricu- 
lum change. Many mathematicians have already been involved in the current 
reforms for some time, but greater participation—encouraging or critical—can 
only be beneficial. 

To progress as a field in how we deal with efforts to improve school mathemat- 
ics, however, we need not only greater participation but also a higher level of 
discourse about those efforts. Critiques need to be based on substantive analyses 
that are grounded in evidence. They should consist of more than capricious 
assertions and bleak prophesies. We need to move from anecdote to analysis, from 
evisceration to evidence, from diatribe to dialogue. 


SUPPORTING ONE ANOTHER. Over the past century, the American mathemat- 
ical community has become one of the most cohesive academic communities in the 
world. Few disciplines anywhere, for example, have organizations such as the 
Conference Board of the Mathematical Sciences or the Mathematical Sciences 
Education Board to unite all elements of the community, from elementary school 
teachers to applied mathematicians working in industry. At conferences bringing 
together selected teachers, professors, and other scholars from across the conti- 
nent to discuss education problems, those concerned with mathematics are almost 
invariably the first to coalesce, with many already well acquainted with one 
another. Since the American mathematical research community emerged over a 
century ago [23], American mathematics education has benefited from a virtually 
continual stream of support from prominent research mathematicians who have 
taken an interest in education, been willing to speak out for it publicly, and helped 
work for educational change. 

Despite the cohesiveness, however, strains appear from time to time when the 
school mathematics curriculum is under scrutiny. These strains can be expressed as 
polar opposites, although of course there is always a spectrum of opinion in 
between. Some favor pure mathematics; others applied. Some want mathematics 
taught as they learned it; others want a different approach. Some are concerned 
primarily with developing the next generation of mathematicians; others are 
concerned primarily with mathematical literacy for all. For some, the deductive 
side of mathematics is what counts; others prefer the empirical, fallibilist, cultur- 
ally determined side. 

Whenever the mathematics taught in schools seems especially removed from 
mathematics seen as a scientific discipline and human enterprise, the strains can 
become especially great. That has happened at the beginning of the century, at 
mid-century, and now as the century draws to a close. The tension that these 
disagreements entail should remind us that if we did not share so much in 
common, we would not have such good grounds on which to disagree and to work 
toward a resolution. 

Change in education is notoriously complex, difficult, and unpredictable. Re- 
form movements in mathematics education turn out neither as advocates hope nor 
as detractors fear. But these movements can energize those teachers who want, as 
Begle once put it, to teach better mathematics and to teach mathematics better. As 
teachers struggle to improve their practice, a reform vision can provide needed 
direction, and membership in a mathematical community can provide needed 
support. 


NOTES 


Edited by Jimmie D. Lawson and William Adkins 


A Shorter Proof of the Ramanujan 
Congruence Modulo 5 


John L. Drost 


A partition of a natural number n is a way of writing n as an unordered sum of 
natural numbers. The partition function p(n) counts the number of partitions of n. 
For example, p(4) = 5, with 4, 3 + 1, 2 + 2, 2 + 1 + 1, 1 + 1 + 1 + 1 being the 
five partitions of 4. The extraordinary Ramanujan at first noticed, then proved, 
some very interesting congruence properties of p(n) modulo the primes 5, 7, and 
11. In fact, he generalized them to congruences modulo powers of those primes. 
For the prime 5, Ramanujan’s congruence is the following: 


Theorem 1. If m is a nonnegative integer, then $p(5m + 4) \equiv 0 \pmod 5$. 


The first few cases are p(4) = 5, p(9) = 30, p(14) = 135. The shortest proof of 
Theorem 1 in the literature can be found in [1; pp. 176-177] or [2; pp. 287-289]. 
They use both Euler’s pentagonal number formula (for the inverse of the generating 
function of p(n)), and the following identity, due to Jacobi: 


\[ [(1 - x)(1 - x^2)(1 - x^3) \cdots]^3 = 1 - 3x + 5x^3 - 7x^6 + \cdots + (-1)^n (2n + 1) x^{n(n+1)/2} + \cdots \qquad (1) \]


The congruence is then deduced by working in the power series ring modulo 5. A 
combinatorial proof of (1), using the Involution Principle, can be found in [3]; the 
classical proof uses the Jacobi triple product formula. 

The proof given here is similar, but requires only Jacobi’s identity and the 
binomial theorem. Along the way, a congruence result involving the square of the 
partition generating function is shown. 


Proof of Theorem 1: The partition function has corresponding generating function 
given by $P(x) = [(1 - x)(1 - x^2)(1 - x^3) \cdots]^{-1} = 1 + x + 2x^2 + 3x^3 + \cdots + p(n)x^n + \cdots$. 
Let $P_k(x)$ be $[P(x)]^k$ and let the $x^n$ coefficient of $P_k(x)$ be 
$p_k(n)$. The first step in the proof is to show that $p_2(5m + i) \equiv 0 \pmod 5$ for all 
$m \ge 0$, and i = 2, 3, or 4. To see this, note that the left side of (1) is $P_{-3}(x)$ from 
the product expansion of P(x), and that $P_2(x) = P_{-3}(x)P_5(x) \equiv P_{-3}(x)P(x^5) \pmod 5$. 
In the congruence we are using the fact, which follows from the binomial 
theorem, that for all power series f(x), $[f(x)]^5 \equiv f(x^5) \pmod 5$. Expanding this last 
product gives 

\[ P_2(x) \equiv (1 - 3x - 7x^6 + 9x^{10} - 11x^{15} + \cdots)(1 + x^5 + 2x^{10} + 3x^{15} + \cdots) \pmod 5. \qquad (2) \]


The exponents of the first factor are all 0 or 1 mod 5, and the multiplication by the 
second factor does not change this. So the coefficients $p_2(n)$ mod 5 are nonzero 
only if $n \equiv 0$ or $1 \pmod 5$. In terms of power series, $P_2(x) \equiv r(x^5) + x\,s(x^5) \pmod 5$ 
for some power series r(x), s(x). To finish the proof, take the cube of both 
sides of the preceding congruence. This gives 


\[ P_6(x) \equiv r(x^5)^3 + 3x\,r(x^5)^2 s(x^5) + 3x^2\,r(x^5) s(x^5)^2 + x^3 s(x^5)^3 \pmod 5. \]


Since $r(x^5)$ and $s(x^5)$ are power series in $x^5$, this implies that the $x^{5m+4}$ 
coefficients of $P_6(x)$ are divisible by 5. Multiplying this last congruence by $P_{-5}(x)$, 
which is another power series in $x^5$ mod 5, does not alter this. ∎ 
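
The two congruences used in the proof can be checked numerically for small n. The following is only a sanity-check sketch, not part of the proof; the helper names `partition_counts` and `series_square` and the cutoff 200 are illustrative choices.

```python
def partition_counts(limit):
    """Return [p(0), ..., p(limit)] using the standard parts-by-parts recurrence."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p


def series_square(coeffs):
    """Coefficients (up to the same degree) of the square of a truncated power series."""
    return [
        sum(coeffs[i] * coeffs[k - i] for i in range(k + 1))
        for k in range(len(coeffs))
    ]


LIMIT = 200  # illustrative cutoff
p = partition_counts(LIMIT)
p2 = series_square(p)  # coefficients p_2(n) of P(x)^2

# Theorem 1: p(5m + 4) is divisible by 5.
theorem_holds = all(p[n] % 5 == 0 for n in range(4, LIMIT + 1, 5))
# Intermediate step: p_2(n) is divisible by 5 unless n is 0 or 1 mod 5.
step_holds = all(p2[n] % 5 == 0 for n in range(LIMIT + 1) if n % 5 in (2, 3, 4))

print(p[4], p[9], p[14], theorem_holds, step_holds)
```

The first three printed values are the cases p(4), p(9), p(14) quoted after Theorem 1.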


This proof was motivated by the classical one of the Ramanujan congruence 
mod 7, i.e., $p(7m + 5) \equiv 0 \pmod 7$. In that case one multiplies $P_{-3}(x)$ by $P_7(x)$. 
Modulo 7, one gets $P_4(x) \equiv t + xu + x^3 v$, where t, u, v are all power series in $x^7$. 
Squaring this gives $P_8(x)$ having coefficients of $x^{7m+5}$ divisible by 7, which again 
won’t change upon multiplying by $P_{-7}(x)$. 

Alternate proofs can be obtained by replacing the cubing in the mod 5 case and 
the squaring in the mod 7 case by taking the square roots and fourth roots, 
respectively, of the formal power series. This gives the function P(x) directly; 
however, there must now be an additional argument to show that the binomial 
coefficients generated (or the products of binomial coefficients in the mod 7 case) 
are 0 in the appropriate modulus. 


An Amusing Representation of x/sin x 


Scott Ahlgren, Lars English, and Ron Winters 


In the course of their work [1] on the quantum mechanics of a simple harmonic 
oscillator, the latter two authors were confronted with a limit of products of partial 
approximates of a continued fraction. With the aid of the symbolic manipulator 
Maple, they discovered that the quantity in question was nothing other than 
x/sin x. In this Note we give a derivation of the resulting formula, which may be of 
interest to mathematicians and physicists alike. 

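When x is a non-zero real number and n is a positive integer with $n > |x|/\sqrt{2}$, 
define $y(n, x) = 2 - x^2/n^2$. Set $C_0 = 0$ and $C_1 = 1/y(n, x)$; define 
$C_p = (y(n, x) - C_{p-1})^{-1}$ for p = 2, 3, ..., so that 

\[ C_p = \cfrac{1}{y(n, x) - \cfrac{1}{y(n, x) - \cfrac{1}{\ddots - \cfrac{1}{y(n, x)}}}} \qquad (p \text{ times}). \]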


If x is not an integer multiple of π, we claim that 

\[ \lim_{n \to \infty} n \prod_{p=1}^{n-1} C_p = \frac{x}{\sin x}. \qquad (1) \]


Our proof uses a potpourri of techniques from continued fractions, recurrence 
sequences, and calculus. To begin, set $A_0 = 0$ and $A_1 = 1$, and define 
$A_p = y(n, x)A_{p-1} - A_{p-2}$ for p = 2, 3, .... An induction shows that $C_p = A_p/A_{p+1}$, so 
the product in (1) collapses, and the left side becomes simply $\lim_{n \to \infty} n/A_n$. 


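Let 

\[ \tau = \tfrac{1}{2}\bigl(y(n, x) + \sqrt{y(n, x)^2 - 4}\bigr) \quad\text{and}\quad \bar\tau = \tfrac{1}{2}\bigl(y(n, x) - \sqrt{y(n, x)^2 - 4}\bigr) \]

be the roots of the polynomial 

\[ X^2 - y(n, x)X + 1. \qquad (2) \]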


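Since 0 < y(n, x) < 2, we see that $\tau \ne \bar\tau$ and that $\bar\tau$ is the complex conjugate of $\tau$. 
Using (2) and induction on p, we find that $A_p = (\tau^p - \bar\tau^p)/(\tau - \bar\tau)$, and our limit 
becomes 

\[ \lim_{n \to \infty} \frac{n}{A_n} = \lim_{n \to \infty} \frac{n(\tau - \bar\tau)}{\tau^n - \bar\tau^n}. \qquad (3) \]

Notice that 

\[ \tau = 1 - \frac{x^2}{2n^2} + \frac{i|x|}{2n}\sqrt{4 - \frac{x^2}{n^2}}, \]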


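from which we obtain $\lim_{n \to \infty} n(\tau - \bar\tau) = \lim_{n \to \infty} i|x|(4 - x^2/n^2)^{1/2} = 2i|x|$. Using 
l’Hôpital’s rule, we find that $\lim_{n \to \infty} n \log \tau = i|x|$, whence $\lim_{n \to \infty} \tau^n = e^{i|x|}$. Therefore 
the limit in (3) is 

\[ \frac{2i|x|}{e^{i|x|} - e^{-i|x|}} = \frac{x}{\sin x}. \]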


Notice that when x = 0 we have $A_n = n$; in this case the left side of (1) equals 1, 
as expected. 
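
The collapsed form n/A_n makes the limit easy to test numerically. The sketch below is an illustration only: the function name `n_over_A_n`, the sample value x = 1 and the cutoff n = 10000 are assumptions made for the example, and floating-point arithmetic limits the attainable accuracy.

```python
import math


def n_over_A_n(x, n):
    """Return n / A_n for A_0 = 0, A_1 = 1, A_p = y*A_{p-1} - A_{p-2}, y = 2 - x^2/n^2.

    By the telescoping argument this equals n times the product C_1 * ... * C_{n-1}.
    """
    y = 2.0 - (x * x) / (n * n)
    prev, cur = 0.0, 1.0  # A_0 and A_1
    for _ in range(2, n + 1):  # n - 1 steps take cur from A_1 to A_n
        prev, cur = cur, y * cur - prev
    return n / cur


x = 1.0  # any real x that is not an integer multiple of pi
approx = n_over_A_n(x, 10_000)
exact = x / math.sin(x)
print(approx, exact)
```

For x = 0 the recurrence gives A_p = p exactly, so the function returns 1 for every n, in agreement with the closing remark of the Note.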


[Cartoon. Legible fragments of its text: “... is therefore unplayable!!”; “I understand that the new ... education requirement has an extended mathematics component...”; sign: “THE REAL FIELD”.] 



Contributed by Russ Hood, Rio Linda, CA 


966 NOTES [December 


PROBLEMS AND SOLUTIONS 


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 


with the collaboration of Paul T. Bateman, Duane M. Broline, Ezra A. Brown, Richard T. Bumby, 
Underwood Dudley, Michael A. Filaseta, Ira M. Gessel, Bart Goddard, Jerrold R. Griggs, Douglas 
A. Hensley, John R. Isbell, Robert Israel, Murray S. Klamkin, Daniel J. Kleitman, Fred Kochman, 
Frederick W. Luttmann, Frank B. Miles, Richard Pfiefer, Leonard Smiley, John Henry Steelman, 
Kenneth Stolarsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY 
problems address on the inside front cover. Submitted problems should include 
solutions and relevant references. Submitted solutions should arrive at that address 
before May 31, 1998. Additional information, such as generalizations and references, 
is welcome. The problem number and the solver’s name and address should 
appear on each solution. An acknowledgement will be sent only if a mailing label 
is provided. An asterisk (*) after the number of a problem or a part of a problem 
indicates that no solution is currently available. 


PROBLEMS 


10627. Proposed by George E. Andrews, The Pennsylvania State University, University 
Park, PA. The Rogers-Ramanujan (RR) partitions of an integer are those that have no 
repetitions and no consecutive integers as parts. The RR’ partitions are those RR partitions 
that have no 1’s. 

(a) For n > 1, prove that at least half of the RR partitions of n are RR’ partitions. 

(b) Let Q(n) denote the number of RR’ partitions of n into at least two parts whose two 
largest parts differ by at most 2 more than the number of parts. For example, Q(12) = 3 
because, of the nine RR partitions of 12, six are RR’ partitions, and of these only three (8 + 4, 
7 + 5, and 6 + 4 + 2) meet the stated condition. For n > 1, prove that Q(n) equals the 
difference between twice the number of RR’ partitions of n and the number of RR partitions 
of n. 
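
The worked example Q(12) = 3 can be reproduced by brute force. This is a hedged sketch, not a solution: the names `rr_partitions` and `summary` are illustrative, and the condition in (b) is read literally as "largest part minus second-largest part is at most (number of parts) + 2". Under that literal reading no RR’ partition of 2 or 3 has two parts, so the loop below compares the two sides only for 4 ≤ n ≤ 14.

```python
def rr_partitions(n):
    """List the partitions of n (decreasing tuples) whose parts differ by at least 2."""
    def build(remaining, max_part):
        if remaining == 0:
            yield ()
            return
        for part in range(min(remaining, max_part), 0, -1):
            for rest in build(remaining - part, part - 2):
                yield (part,) + rest
    return list(build(n, n))


def summary(n):
    """Return (#RR partitions, #RR' partitions, list of partitions counted by Q(n))."""
    rr = rr_partitions(n)
    rr_prime = [lam for lam in rr if lam[-1] != 1]  # no part equal to 1
    counted = [
        lam for lam in rr_prime
        if len(lam) >= 2 and lam[0] - lam[1] <= len(lam) + 2
    ]
    return len(rr), len(rr_prime), counted


print(summary(12))

for n in range(4, 15):
    total, primed, counted = summary(n)
    assert 2 * primed >= total             # part (a)
    assert len(counted) == 2 * primed - total  # part (b), literal reading
```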


10628. Proposed by George E. Andrews, The Pennsylvania State University, University 
Park, PA. Let $p_{a,b}(n)$ denote the number of partitions of n that contain no parts of size a or 
b. For n > 0, prove that 

\[ \sum_{j \ge 1} (-1)^j \, p_{j,2j}\!\left(n - \frac{3j(j-1)}{2}\right) = 0. \]

For example, when n = 9 the assertion is $-p_{1,2}(9) + p_{2,4}(6) - p_{3,6}(0) = 0$, which is 
true because $p_{1,2}(9) = 4$ (the relevant partitions are 9, 6+3, 5+4, 3+3+3), $p_{2,4}(6) = 5$ (the 
relevant partitions are 6, 5+1, 3+3, 3+1+1+1, 1+1+1+1+1+1), and $p_{3,6}(0) = 1$ (the empty 
partition of 0 satisfies the condition). 
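
The n = 9 example can be reproduced, and the alternating sum tested for other small n, with a short count. This is a sketch under assumptions: the function names are illustrative, the argument of the j-th term is taken to be n − 3j(j−1)/2 (the offsets 0, 3, 9 that the worked example uses), and the count is taken to be 0 for negative arguments.

```python
def count_partitions_avoiding(n, forbidden):
    """Number of partitions of n with no part in `forbidden` (0 when n < 0)."""
    if n < 0:
        return 0
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        if part in forbidden:
            continue
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


def alternating_sum(n):
    """Sum over j >= 1 of (-1)^j * p_{j,2j}(n - 3j(j-1)/2), stopping once the argument is negative."""
    total, j = 0, 1
    while n - 3 * j * (j - 1) // 2 >= 0:
        argument = n - 3 * j * (j - 1) // 2
        total += (-1) ** j * count_partitions_avoiding(argument, {j, 2 * j})
        j += 1
    return total


print(count_partitions_avoiding(9, {1, 2}),
      count_partitions_avoiding(6, {2, 4}),
      count_partitions_avoiding(0, {3, 6}),
      alternating_sum(9))
```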


10629. Proposed by Frank Schmidt, Arlington, VA. Let p(n) denote the number of partitions 
of the integer n, and let f(n) denote the number of partitions $\lambda_1 + \lambda_2 + \lambda_3 + \cdots$ satisfying 
$\lambda_1 > \lambda_2 > \lambda_3 > \cdots$ and $n = \lambda_1 + \lambda_3 + \lambda_5 + \cdots$. For example, p(5) counts the 7 partitions 
5, 4 + 1, 3 + 2, 3 + 1 + 1, 2 + 2 + 1, 2 + 1 + 1 + 1, and 1 + 1 + 1 + 1 + 1, and f(5) 
counts the 7 partitions 5, 5 + 1, 5 + 2, 5 + 3, 5 + 4, 4 + 3 + 1, and 4 + 2 + 1. Prove that 
p(n) = f(n) for every positive integer n. 
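
A small count illustrates the claim for the first few n. The sketch assumes the reading suggested by the example f(5) = 7, namely that all parts are distinct and only the parts in odd positions are summed; the names `p`, `f` and `_continuations` are illustrative.

```python
from functools import lru_cache


def p(n):
    """Number of partitions of n."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


@lru_cache(maxsize=None)
def _continuations(remaining, bound, odd_position):
    """Count ways to finish a strictly decreasing list of parts, each below `bound`,
    whose odd-position parts must still add up to `remaining`."""
    total = 1 if remaining == 0 else 0  # stopping here is allowed once the sum is reached
    for part in range(1, bound):
        if odd_position:
            if part <= remaining:
                total += _continuations(remaining - part, part, False)
        else:
            total += _continuations(remaining, part, True)
    return total


def f(n):
    """Number of distinct-part partitions whose odd-position parts sum to n."""
    return _continuations(n, n + 1, True)


print([(n, p(n), f(n)) for n in range(1, 8)])
```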

10630. Proposed by Richard Stong, Rice University, Houston, TX. It is possible to show 
that $\csc(3\pi/29) - \csc(10\pi/29) = 1.999989433\ldots$. Prove that there are no integers j, k, n 
with n odd satisfying $\csc(j\pi/n) - \csc(k\pi/n) = 2$. 


10631. Proposed by Greg Huber, University of Chicago, Chicago, IL. Given a triangle T, 
let the intriangle of T be the triangle whose vertices are the points where the circle inscribed 
in T touches T. Given a triangle $T_0$, form a sequence of triangles $T_0, T_1, T_2, \ldots$ in which 
each $T_{n+1}$ is the intriangle of $T_n$. Let $d_n$ be the distance between the incenters of $T_n$ and 
$T_{n+1}$. Find $\lim_{n \to \infty} d_{n+1}/d_n$ when $T_0$ is not equilateral. 


10632. Proposed by William F. Trench, Trinity University, San Antonio, TX. For given 
nonnegative integers m and n, evaluate 


reses —(7)a- aap? (—1)* (jy 
— ees m+k+1\k 


10633. Proposed by Kiran S. Kedlaya, Princeton University, Princeton, NJ. Let S be a 
commuting family of n-by-n matrices over an arbitrary field. Suppose the matrices in S 
have a common eigenvector v, so that $Mv = \lambda_M v$ for all $M \in S$. Prove that the transposes 
of these matrices also have a common eigenvector with these eigenvalues, that is, a vector 
w satisfying $M^T w = \lambda_M w$ for all $M \in S$. 


SOLUTIONS 


A Partial Comparison Test for Divergence 


10412 [1994, 911]. Proposed by Donald A. Darling, Newport Beach, CA. Find necessary 
and sufficient conditions on a nonincreasing sequence $a_1, a_2, \ldots$ of positive real numbers 
so that, if $b_1, b_2, \ldots$ is a nonincreasing sequence with $b_k \ge a_k$ for infinitely many k, then 

\[ \sum_{n=1}^{\infty} b_n = \infty. \]


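Solution by John H. Lindsey II, Fort Myers, FL. We need not assume the a’s are nonincreasing. 
The sought condition is $\liminf n a_n > 0$. 

Suppose $\liminf n a_n = 0$. Let $n_0 = 0$. Pick $n_1$ with $n_1 a_{n_1} < 1/2$, and then for each 
k > 1 pick $n_k > n_{k-1}$ with $n_k a_{n_k} < \min(n_{k-1} a_{n_{k-1}}, 2^{-k})$. For $n_{k-1} < n \le n_k$ define 
$b_n = a_{n_k}$. The b’s are nonincreasing, and 

\[ \sum_{n=1}^{\infty} b_n = \sum_{k=1}^{\infty} \sum_{n=n_{k-1}+1}^{n_k} a_{n_k} = \sum_{k=1}^{\infty} (n_k - n_{k-1}) a_{n_k} \le \sum_{k=1}^{\infty} n_k a_{n_k} \le \sum_{k=1}^{\infty} 2^{-k} = 1 < \infty. \]

Suppose $\liminf n a_n = 2c > 0$. Then there exists N so that n > N implies $n a_n > c$. 
If the b’s satisfy the given conditions, we can define a sequence $n_k$ with $n_0 = 0$, $n_1 > N$, 
$n_k > 2n_{k-1}$ for k > 1, and $b_{n_k} \ge a_{n_k}$. Then 

\[ \sum_{n=1}^{\infty} b_n \ge \sum_{k=1}^{\infty} \sum_{n=n_{k-1}+1}^{n_k} b_{n_k} = \sum_{k=1}^{\infty} (n_k - n_{k-1}) b_{n_k} \ge \sum_{k=1}^{\infty} \tfrac{1}{2} n_k a_{n_k} \ge \sum_{k=1}^{\infty} \tfrac{1}{2} c = \infty. \]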


Solved also by D. W. Bailey, R. Barbara (Lebanon), P. Budney, R. J. Chapman (U. K.), E. Hertz, R. Holzsager, M. Hudel- 
son, N. Komanda, O. P. Lossers (The Netherlands), D. Marcus, A. Meir (Canada), H. Morris, G. Myerson (Australia), 
V. Novakov (Bulgaria), A. Pedersen (Denmark), C. G. Petalas & T. P. Vidalis (Greece), M. Reid, R. M. Robinson, 
K. Schilling, R. Stong, P. Szeptycki, A. A. Tarabay (Lebanon), D. R. Witte, A. N. ’t Woord (The Netherlands), NSA 
Problems Group, and the proposer. 


A Singular Inequality 


10421 [1994, 1014]. Proposed by Gigel Militaru, University of Bucharest, Bucharest, 
Romania. Let n be an integer, $n \ge 3$, and let $z_1, \ldots, z_n$ and $t_1, \ldots, t_n$ be complex numbers. 
Prove that there exists an integer i with $1 \le i \le n$ such that $4|z_i t_i| \le \sum_{j=1}^{n} |z_i t_j + z_j t_i|$. 


Solution by O. P. Lossers, University of Technology, Eindhoven, The Netherlands. We prove 
the following generalization: 


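Proposition. Let $a_1^{(l)}, \ldots, a_n^{(l)}$ and $b_1^{(l)}, \ldots, b_n^{(l)}$ ($l = 1, \ldots, k$; k < n) be complex numbers. 
Then there exists an integer i with $1 \le i \le n$ such that 

\[ 2\left| a_i^{(1)} b_i^{(1)} + \cdots + a_i^{(k)} b_i^{(k)} \right| \le \sum_{j=1}^{n} \left| a_i^{(1)} b_j^{(1)} + \cdots + a_i^{(k)} b_j^{(k)} \right|. \]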


Proof. Let $M^{(l)}$ be the n × n matrix defined by $M^{(l)}_{ij} = a_i^{(l)} b_j^{(l)}$ and define $M = \sum_{l=1}^{k} M^{(l)}$. 
The matrix $M^{(l)}$ has rank at most one, so $\operatorname{rank}(M) \le k < n$. Being singular, M cannot be 
“diagonally dominant”, so for some i we must have $|M_{ii}| \le \sum_{j \ne i} |M_{ij}|$ or, removing the 
restriction on j, $2|M_{ii}| \le \sum_{j=1}^{n} |M_{ij}|$, which gives the result. ∎ 

The desired inequality is obtained from the Proposition by taking k = 2, $a_i^{(1)} = b_i^{(2)} = z_i$, 
and $a_i^{(2)} = b_i^{(1)} = t_i$. 


Editorial comment. For a proof of this property of diagonally dominant matrices, and 
more, see Roger A. Horn & Charles R. Johnson, Matrix Analysis, Cambridge, 1985, Theo- 
rem 6.1.10, p. 349. 


Solved also by D. Beckwith, R. J. Chapman (U. K.), J. H. Lindsey II, R. Vermes (Canada), and the proposer. 


Self-sorting in Tournaments 


10447 [1995, 360]. Proposed by Stephen C. Locke, Florida Atlantic University, Boca Raton, 
FL. Consider a tournament in which every pair of teams played a match that one of the two 
won. Let $L_0$ be a listing of the teams in some order, and define successive $L_i$, i = 1, 2, 3, ... 
by repeated application of the following operation: if a team in the list $L_i$ lost to the team 
immediately following it in the list, call that pair of teams a switchable pair; the order of 
one switchable pair is then reversed to give $L_{i+1}$. Note that this may increase the number 
of switchable pairs. 

Prove that any such sequence of operations leads, in a finite number of steps, to a list 
in which every team defeated the team immediately following it in the list, so there are no 
switchable pairs. 


Composite solution by A. N. ’t Woord, University of Technology, Eindhoven, The Netherlands, 
and Jerrold R. Griggs, University of South Carolina, Columbia, SC. Let $a_i$ denote 
the number of pairs (A, B) where team A lost to team B, but team A precedes team B in 
list $L_i$. Then $a_{i+1} = a_i - 1$. Since all $a_i$ are nonnegative integers, the process described 
must lead, after at most $\binom{n}{2}$ steps (where n is the number of teams), to a list without 
switchable pairs. The maximum number of switches, $\binom{n}{2}$, is required if and only if we 
start with the reversed transitive tournament 
in which team j lost to team k for all j, k with j < k. 
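
The counting argument can be watched in action with a small simulation. This is an illustrative sketch only: the function names, the random choice of which switchable pair to reverse, the seed and the size n = 7 are all assumptions made for the example.

```python
import random


def lost_before_count(order, beats):
    """Number of pairs (A, B) with A ahead of B in the list although A lost to B."""
    return sum(
        1
        for i in range(len(order))
        for j in range(i + 1, len(order))
        if beats[order[j]][order[i]]
    )


def switch_until_done(order, beats, rng):
    """Reverse one randomly chosen switchable pair at a time; return (final list, steps)."""
    order = list(order)
    steps = 0
    while True:
        switchable = [i for i in range(len(order) - 1) if beats[order[i + 1]][order[i]]]
        if not switchable:
            return order, steps
        i = rng.choice(switchable)
        order[i], order[i + 1] = order[i + 1], order[i]
        steps += 1


def random_tournament(n, rng):
    """beats[a][b] is True exactly when team a beat team b."""
    beats = [[False] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            if rng.random() < 0.5:
                beats[a][b] = True
            else:
                beats[b][a] = True
    return beats


rng = random.Random(0)
n = 7
beats = random_tournament(n, rng)
start = list(range(n))
final, steps = switch_until_done(start, beats, rng)
# Each switch lowers the count by exactly one.
assert steps == lost_before_count(start, beats) - lost_before_count(final, beats)
assert steps <= n * (n - 1) // 2

# Reversed transitive tournament: team j lost to team k whenever j < k.
reversed_beats = [[b < a for b in range(n)] for a in range(n)]
worst_final, worst_steps = switch_until_done(start, reversed_beats, rng)
print(steps, worst_steps, worst_final)
```

In the reversed transitive case every one of the n(n−1)/2 pairs starts out of order, so the step count reaches the maximum regardless of which switchable pair is chosen at each stage.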


Solved also by D. Beckwith, K. L. Bernstein, R. J. Chapman (U. K.), B. Dawson, R. Ehrenborg & F. Fares (Canada), 
K. Foltz, S. M. Gagola Jr., F Galvin, J. W. Grossman & R. S. Zeitman, C. Hillar, G. Isaak, N. Komanda, J. H. Lindsey II, 
J. B. Muskat (Israel), A. Nijenhuis, A. Pedersen (Denmark), P. J. Slater, J. H. Steelman, R. Stong, Anchorage Math 
Solutions Group, NSA Problems Group, Oklahoma State Problems Group, WMC Problems Group, and the proposer. 


Polynomial Divisibility 


10452 [1995, 463]. Proposed by Seung-Jin Bang, Ajou University, Suwon, Korea. Find all 
values of n, k, a, and b (n and k positive integers, n > k, a and b nonzero real numbers) for 
which the polynomial $x^n + ax + b$ is divisible by $x^k + ax + b$ in R[x]. 


Composite solution by Gerald A. Heuer, Concordia College, Moorhead, MN, and Roger B. 
Eggleton, Illinois State University, Normal, IL. The values are 

(i) k = 1, a = —1, n arbitrary; 

(ii)k = 1,a 4-1, b = —(a +1), n even; 

(iii) kK = 1,a 4 -—1,b=+(a+1),n odd; 

(iv) k = 2,b=1,a = —2cos (2rm/(n — 2)) for some integer r with 1 < r < [(n — 3)/2] 
andr #£(n —2)/4,n =Sorn> 7. 

Let $P(x) = x^k + ax + b$. Note that $P(x)$ divides $x^n + ax + b$ if and only if it divides
$x^n + ax + b - P(x) = x^n - x^k = (x^{n-k} - 1)x^k$. Since $b \ne 0$, this occurs when $P(x)$
divides $Q(x) = x^{n-k} - 1$.

We first consider the case $k = 1$. If $a = -1$, then $P(x) = b$, which divides $Q(x)$ for
all $b \ne 0$. Otherwise, $P(x) = (a + 1)x + b = (a + 1)(x + b/(a + 1))$, so $P(x)$ divides
$Q(x)$ if and only if $\zeta = -b/(a + 1)$ satisfies $\zeta^{n-1} = 1$. Since $\zeta$ is real, this occurs when
$\zeta = 1$ or when $\zeta = -1$ and $n - 1$ is even. This explains cases (i)-(iii).

Now suppose $k > 1$ and $P(x)$ divides $Q(x)$. Since $P(x)$ is real, its nonreal roots come
in complex conjugate pairs, so its irreducible factors over $\mathbb{R}$ can be only $x - 1$, $x + 1$, or

\[ \left(x - e^{2r\pi i/(n-k)}\right)\left(x - e^{-2r\pi i/(n-k)}\right) = x^2 - 2x\cos\frac{2r\pi}{n-k} + 1 \]

for some $r$ with $0 < r < (n - k)/2$. Note that $x - 1$ is a negative-reciprocal polynomial
and all other possible irreducible factors are reciprocal polynomials, so $P(x)$ is either a
reciprocal polynomial or a negative-reciprocal polynomial. It follows that

\[ x^k + ax + b = P(x) = \pm x^k P(1/x) = \pm\left(bx^k + ax^{k-1} + 1\right). \]

Since $a \ne 0$, comparing coefficients of $x$ shows that $k = 2$ and the plus sign occurs. Now,
comparing coefficients of $x^2$ shows that $b = 1$. Since $Q(x)$ has no multiple roots, $P(x)$
cannot be $(x - 1)^2$ or $(x + 1)^2$, so $P(x)$ must be $x^2 - 2x\cos(2r\pi/(n - k)) + 1$ for some $r$
satisfying $1 \le r \le \lfloor (n - 3)/2 \rfloor$. Note that $r = (n - 2)/4$ gives $a = 0$, which is forbidden.
All other values of $r$ give solutions. There are no allowable values of $r$ unless $n = 5$ or
$n \ge 7$.
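
Case (iv) can be spot-checked numerically. The following is a minimal sketch, not part of the published solution: the helper names `poly_rem`, `case_iv_values`, and `divides_quadratic` are illustrative, and the floating-point tolerance is an assumption chosen for small n.

```python
import math

def poly_rem(num, den):
    """Remainder of num / den; coefficients are listed from highest degree down."""
    rem = list(num)
    while len(rem) >= len(den):
        factor = rem[0] / den[0]
        for i, c in enumerate(den):
            rem[i] -= factor * c
        rem.pop(0)  # the leading coefficient has just been cancelled
    return rem

def case_iv_values(n):
    """Values a = -2 cos(2 r pi / (n - 2)) allowed by case (iv) for this n."""
    values = []
    for r in range(1, (n - 3) // 2 + 1):
        if 4 * r != n - 2:  # r = (n - 2)/4 would give a = 0, which is excluded
            values.append(-2 * math.cos(2 * r * math.pi / (n - 2)))
    return values

def divides_quadratic(n, a, tol=1e-9):
    """Check numerically whether x^2 + a x + 1 divides x^n + a x + 1 (n >= 3)."""
    target = [1.0] + [0.0] * (n - 2) + [a, 1.0]
    return all(abs(c) < tol for c in poly_rem(target, [1.0, a, 1.0]))

for n in range(3, 13):
    print(n, [round(a, 6) for a in case_iv_values(n) if divides_quadratic(n, a)])
```

For n = 3, 4, and 6 the list of admissible values is empty, in line with the closing remark of the solution.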

Solved also by M.-Th. Antoine (France), R. Barbara (Lebanon), M. Benedicty, D. Callan, R. J. Chapman (U. K.),
F. J. Flanigan, Z. Franco, S. M. Gagola Jr., A. Gunawardena, N. Komanda, J. H. Lindsey II, O. P. Lossers (The
Netherlands), A. Pedersen (Denmark), Y. Wang, A. N. ’t Woord (The Netherlands), Anchorage Math Solutions Group,
NSA Problems Group, and the proposer.


More On Stirling’s Approximation 


10521 [1996, 347]. Proposed by D. M. Bloom & G. W. Booth, Brooklyn College, CUNY,
Brooklyn, NY. Let $S_n = (2\pi n)^{1/2}(n/e)^n$ and $T_n = n!/S_n$.

(a) Prove that $T_n - 1 = 1/(12n - a_n)$, where $0 < a_n < 1/2$ for all positive integers $n$.

(b) Prove that the sequence $(a_n)$ is monotonically increasing.

(c)* If $b_n = n(1/2 - a_n)$ for all $n \in \mathbb{N}$, is the sequence $(b_n)$ monotonically increasing?


Solution by Richard Stong, Rice University, Houston, TX. The sequence $(b_n)$ is increasing.
This, as well as parts (a) and (b), follows from the asymptotics of the gamma function carried
through with error bounds. For these error bounds, we use the following convention: Any
appearance of the letter $\theta$ or $\tau$ in a formula means that the formula holds if $\theta$ is replaced by
some number (possibly different for each appearance) in (0, 1) and $\tau$ is replaced by some
number in (-1, 1).
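
Note that $\log(T_n/T_{n+1}) = (n + 1/2)\log(1 + 1/n) - 1$. Stirling’s formula ensures that
$T_n \to 1$, so we have $T_n = \exp(R_n)$, where $R_n = \sum_{k=n}^{\infty}\bigl((k + 1/2)\log(1 + 1/k) - 1\bigr)$.
Expanding in powers of $1/k$ gives

\begin{align*}
&\frac{1}{12}\left(\frac{1}{k} - \frac{1}{k+1}\right) - \frac{1}{360}\left(\frac{1}{k^3} - \frac{1}{(k+1)^3}\right)
 + \frac{1}{1260}\left(\frac{1}{k^5} - \frac{1}{(k+1)^5}\right) - \left(\left(k + \frac{1}{2}\right)\log\left(1 + \frac{1}{k}\right) - 1\right) \\
&\qquad = \sum_{r=2}^{\infty} (-1)^r \left(\frac{1}{k}\right)^{r}
 \left(\frac{1}{1260}\binom{r-1}{4} - \frac{1}{360}\binom{r-1}{2} + \frac{1}{12} - \frac{r-1}{2r(r+1)}\right) \\
&\qquad = \frac{1}{240k^8} - \frac{1}{60k^9} + \cdots. \tag{1}
\end{align*}

(The coefficients have been chosen to cancel the terms with $r \le 7$.) The summands in (1)
alternate in sign, converge to zero, and are decreasing in magnitude for $k \ge 4$. Therefore,

\[ \left(k + \frac{1}{2}\right)\log\left(1 + \frac{1}{k}\right) - 1
 = \frac{1}{12}\left(\frac{1}{k} - \frac{1}{k+1}\right) - \frac{1}{360}\left(\frac{1}{k^3} - \frac{1}{(k+1)^3}\right)
 + \frac{1}{1260}\left(\frac{1}{k^5} - \frac{1}{(k+1)^5}\right) - \frac{\theta}{240k^8}. \]

Summing from $n$ to $\infty$ gives, for $n \ge 4$,

\[ R_n = \frac{1}{12n} - \frac{1}{360n^3} + \frac{1}{1260n^5} - \frac{11\theta}{6720n^7}, \tag{2} \]

where we obtain the last term from the upper bound

\[ \sum_{k=n}^{\infty} \frac{1}{240k^8} < \frac{1}{240n^8} + \int_n^{\infty} \frac{dx}{240x^8}
 = \frac{1}{240n^8} + \frac{1}{1680n^7} \le \frac{11}{6720n^7}. \]

The terms in (2) are decreasing in magnitude for $n \ge 4$, so we can truncate at any step to
get an inequality involving $R_n$. For $n \ge 4$ we get $1/(13n) < R_n < 1/(12n)$, and using this
to bound derivatives gives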


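\[ R_n^3 = \left(\frac{1}{12n} - \frac{\theta}{360n^3}\right)^{3} = \frac{1}{1728n^3} - \frac{\theta}{17280n^5}, \]

\[ R_n^5 = \left(\frac{\theta}{12n}\right)^{5} = \frac{\theta}{248832n^5}, \]

\begin{align*}
\frac{1}{R_n} &= \frac{12n}{1 - 1/(30n^2) + 1/(105n^4)} + \frac{1859\theta}{6720n^5} \\
&= 12n\left(1 + \left(\frac{1}{30n^2} - \frac{1}{105n^4}\right) + \left(\frac{1}{30n^2} - \frac{1}{105n^4}\right)^{2}
 + \theta\left(\frac{1}{30n^2} - \frac{1}{105n^4}\right)^{3}\right) + \frac{1859\theta}{6720n^5} \\
&= 12n + \frac{2}{5n} - \frac{53}{525n^3} - \frac{4}{525n^5} + \frac{4}{3675n^7} + \frac{\theta}{2250n^5} + \frac{1859\theta}{6720n^5} \\
&= 12n + \frac{2}{5n} - \frac{53}{525n^3} + \frac{0.269\tau}{n^5}.
\end{align*}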
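From E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, Cambridge
University Press, Cambridge, 1927, Equations 7.2 and 13.151, we have

\begin{align*}
\frac{1}{T_n - 1} = \frac{1}{e^{R_n} - 1}
 &= \frac{1}{R_n} - \frac{1}{2} + \frac{R_n}{12} - \frac{R_n^3}{720} + \frac{R_n^5}{30240} - \cdots \\
 &= \frac{1}{R_n} - \frac{1}{2} + \sum_{m=1}^{\infty} \frac{(-1)^{m-1}\zeta(2m)}{2^{2m-1}\pi^{2m}} R_n^{2m-1}. \tag{3}
\end{align*}
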
The terms in (3) alternate in sign, tend to zero, and (since the zeta function is decreasing
and $R_n < 2$) are decreasing. Therefore we can truncate (3) to get inequalities. Hence we
see

\[ \frac{1}{T_n - 1} = \frac{1}{R_n} - \frac{1}{2} + \frac{R_n}{12} - \frac{R_n^3}{720} + \frac{R_n^5\theta}{30240}. \]

Plugging in our formulas for powers of $R_n$ gives

\[ \frac{1}{T_n - 1} = 12n - \frac{1}{2} + \frac{293}{720n} - \frac{4406147}{43545600n^3} + \frac{0.27\tau}{n^5}. \tag{4} \]

To prove parts (a) and (b), note that (4) translates into

\[ a_n = \frac{1}{2} - \frac{293}{720n} + \frac{0.12\theta}{n^3} \]

for $n \ge 4$, so

\[ a_{n+1} - a_n = \frac{293}{720n(n + 1)} + \frac{0.12\tau}{n^3} > 0. \]

One also calculates $a_1 = 0.1569\ldots$, $a_2 = 0.3073\ldots$, $a_3 = 0.3678\ldots$, and $a_4 = 0.3997\ldots$.
Therefore, we see that the $a_n$ are increasing for all $n$ and stay between 0
and their upper limit 1/2. For part (c), (4) gives (for $n \ge 4$)

\[ b_n = \frac{293}{720} - \frac{4406147}{43545600n^2} + \frac{0.27\tau}{n^4}, \]

so

\[ b_{n+1} - b_n = \frac{4406147(2n + 1)}{43545600n^2(n + 1)^2} + \frac{0.54\tau}{n^4} > 0. \]

Moreover, $b_1 = 0.3430\ldots$, $b_2 = 0.3853\ldots$, $b_3 = 0.3965\ldots$, and $b_4 = 0.4009\ldots$, so the
$b_n$ are increasing for all $n$.
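
The small cases quoted in the solution can be reproduced in floating point. This is only an illustrative sketch: the function names `T`, `a`, and `b` mirror the symbols of the problem, and the range 1 to 20 is an assumption chosen so that rounding error stays well below the gaps between consecutive terms.

```python
import math

def T(n):
    """T_n = n! / S_n with S_n = sqrt(2 pi n) (n/e)^n."""
    return math.factorial(n) / (math.sqrt(2 * math.pi * n) * (n / math.e) ** n)

def a(n):
    """a_n, defined by T_n - 1 = 1 / (12 n - a_n)."""
    return 12 * n - 1 / (T(n) - 1)

def b(n):
    """b_n = n (1/2 - a_n)."""
    return n * (0.5 - a(n))

a_vals = [a(n) for n in range(1, 21)]
b_vals = [b(n) for n in range(1, 21)]
for n in range(1, 5):
    print(n, round(a_vals[n - 1], 4), round(b_vals[n - 1], 4))
```

Both lists come out increasing on this range, and every a_n lies strictly between 0 and 1/2, which agrees with parts (a) through (c) without, of course, proving them.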


Solved also by J. Anglesio (France), P. Bracken, J. S. Frame, J. H. Lindsey II, R. Richberg (Germany), G. Rzadkowski 
(Poland), Z. Sasvari (Germany), GCHQ Problems Group (U. K.), and the proposers (parts a and b). 


Balanced and Unbalanced Polygons 


10526 [1996, 427]. Proposed by Harry Tamvakis, University of Chicago, Chicago, IL. Let
$P = A_1A_2\ldots A_n$ be a convex polygon. For any point $M$ in the interior, let $B_i$ be the point
where $A_iM$ intersects the perimeter. We say that $P$ is balanced if for some such $M$ the
points $B_1, B_2, \ldots, B_n$ are interior to distinct sides of $P$. Prove or disprove:

(a) If n is even, then P is not balanced. 

(b) If n is odd, then P is balanced. 


Solution by Mark Bowron and Stanley Rabinowitz, MathPro Press, Westford, MA. We prove 
(a) and disprove (b). 

(a) Suppose $n$ is even. Pick any two opposite vertices of $P$, say $A_1$ and $A_m$, where $m =
n/2 + 1$. The diagonal joining them divides $P$ into two halves $H_1 = A_1A_2\ldots A_m$ and
$H_2 = A_mA_{m+1}\ldots A_nA_1$. Suppose $M$ is a balance point (so that $B_1, B_2, \ldots, B_n$ are
interior to distinct sides of $P$). Note that $M$ cannot lie on $A_1A_m$ since $B_1 \ne A_m$. Thus $M$ is
interior to either $H_1$ or $H_2$. If $M$ is interior to $H_2$, then the $m$ points $B_1, B_2, \ldots, B_m$ must
lie on the $m - 1$ sides of $P \cap H_2$ since $P$ is convex. By the pigeonhole principle, some side
of $P$ therefore contains more than one $B_i$. This is a contradiction. A similar contradiction
arises if $M$ is interior to $H_1$. Therefore $P$ is not balanced.

(b) We give a counterexample (with $n = 9$) to show that $P$ need not be balanced. Let $a_i$
denote the side of the polygon opposite vertex $A_i$. If $M$ is a balance point, then $B_i$ must lie
on $a_i$. Otherwise $A_iB_i$ would divide $P$ into two halves, with one half having more vertices
(from which balance rays must emanate) than the other half has edges (to which balance
rays must terminate). Thus for all $i$, the point $M$ must lie inside the triangle with vertex $A_i$
and opposite side $a_i$.

If the polygon in the figure contains a balance point, that point must lie in each of the 
three shaded triangles. But these triangles have no point in common. Hence the polygon is 
not balanced. 


An unbalanced 9-gon. 


Editorial comment. All convex n-gons for which n is odd and n ≤ 7 are balanced. For
triangles, any interior point is a balance point. For convex pentagons, any point within 
the interior pentagon formed by the diagonals is a balance point. Helly’s Theorem can be 
applied to show that all convex heptagons are balanced. 
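
The definition of a balance point is easy to test numerically for a given polygon and candidate point. The sketch below is only an illustration under stated assumptions: the helper names (`exit_side`, `is_balance_point`, `regular_polygon`) are hypothetical, vertices are given in cyclic order, and a tolerance `eps` decides whether a ray meets a side strictly between its endpoints.

```python
import math

def cross(ax, ay, bx, by):
    return ax * by - ay * bx

def exit_side(vertices, i, m, eps=1e-9):
    """Index of the side whose interior is hit by the ray from vertex i through m, else None."""
    ax, ay = vertices[i]
    dx, dy = m[0] - ax, m[1] - ay
    n = len(vertices)
    for j in range(n):
        vx, vy = vertices[j]
        wx, wy = vertices[(j + 1) % n]
        ex, ey = wx - vx, wy - vy
        denom = cross(dx, dy, ex, ey)
        if abs(denom) < eps:
            continue  # the ray is parallel to this side
        s = cross(vx - ax, vy - ay, ex, ey) / denom  # position along the ray (m is at s = 1)
        u = cross(vx - ax, vy - ay, dx, dy) / denom  # position along the side (0 to 1)
        if s > 1 and eps < u < 1 - eps:
            return j
    return None  # the ray leaves through a vertex

def is_balance_point(vertices, m):
    """True if the points B_i lie in the interiors of pairwise distinct sides."""
    sides = [exit_side(vertices, i, m) for i in range(len(vertices))]
    return None not in sides and len(set(sides)) == len(sides)

def regular_polygon(n):
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n)) for k in range(n)]

print(is_balance_point(regular_polygon(5), (0.0, 0.0)))
print(is_balance_point(regular_polygon(4), (0.3, 0.2)))
```

The centre of a regular pentagon is reported as a balance point, while the sampled interior point of the square is not, in line with part (a).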

Solved also by G. L. Body (U. K.), R. J. Chapman (U. K.), D. A. Darling, J. E. Dawson (Australia), H. Guggenheimer
(part a only), E. Gutkin (part a only), D. C. Kay, J. H. Lindsey II, M. D. Meyerson, C. Popescu (Belgium), H. Sedinger,
GCHQ Problems Group (U. K.), and the proposer.


Is It An Integer? 


10527 [1996, 427]. Proposed by Vicentiu Pagol, Craiova, Romania. For positive integers
m and n, let

\[ N = 2\int_0^{\pi/2} \sin^{2n - 1/2}\theta \cdot \cos^{2m + 1/2}\theta \, d\theta. \]

Prove that $\sin\left(2^{3(m+n)-2} \cdot N \cdot \sqrt{2}\right) = 0$.


Composite solution by Allen Stenger, Tustin, CA, and the late J. Sutherland Frame, East
Lansing, MI. Write $A = 2^{3(m+n)-2} \cdot N \cdot \sqrt{2}/\pi$. The stated result is a roundabout way of
saying that $A$ is an integer. We exhibit $A$ as a quotient of integers and show that each prime
divides the numerator in this quotient to at least as high a power as it does the denominator.

We use two $\Gamma$-function formulas in E. T. Whittaker and G. N. Watson, A Course of
Modern Analysis, Cambridge, 1927, sections 12.42 and 12.14. The first is $N = \Gamma(m + 3/4)
\Gamma(n + 1/4)/\Gamma(m + n + 1)$. The second is the functional equation $\Gamma(z)\Gamma(1 - z) = \pi/\sin \pi z$
in the particular case $z = 1/4$, which yields $\Gamma(1/4)\Gamma(3/4) = \pi\sqrt{2}$. This gives

\begin{align*}
A &= 2^{3(m+n)-2}\,
 \frac{\frac{4m-1}{4}\cdot\frac{4m-5}{4}\cdots\frac{3}{4}\,\Gamma\!\left(\frac{3}{4}\right)\cdot
 \frac{4n-3}{4}\cdot\frac{4n-7}{4}\cdots\frac{1}{4}\,\Gamma\!\left(\frac{1}{4}\right)}{(m+n)!}\cdot\frac{\sqrt{2}}{\pi} \\
 &= 2^{m+n-2}\,\frac{(4m-1)(4m-5)\cdots 3\cdot(4n-3)(4n-7)\cdots 1}{(m+n)!}\cdot\pi\sqrt{2}\cdot\frac{\sqrt{2}}{\pi} \\
 &= 2^{m+n-1}\,\frac{(4m-1)(4m-5)\cdots 3\cdot(4n-3)(4n-7)\cdots 1}{(m+n)!}.
\end{align*}

The exponent of the highest power of $p$ dividing $n!$ is $\sum_{k \ge 1} \lfloor n/p^k \rfloor$. For $p = 2$ the
exponent of the power of 2 in the denominator of $A$ is

\[ \sum_{k \ge 1} \left\lfloor \frac{m+n}{2^k} \right\rfloor < \sum_{k \ge 1} \frac{m+n}{2^k} = m + n, \]

hence is not greater than $m + n - 1$, the exponent of the power of 2 in the numerator.

Now we consider the odd primes. Writing $A = A(m, n)$ as

\[ 2^{m+n-1}(-1)^m\,\frac{(1 - 4m)(5 - 4m)\cdots(-3)\cdot 1\cdot 5\cdots(4n - 7)(4n - 3)}{(m+n)!}, \]

we see that the numerator is a product of $m + n$ consecutive terms in an arithmetic progression
of difference 4. For any (odd) $p^k$, exactly one in every consecutive $p^k$ terms of this product
is divisible by $p^k$, so there are either $\lfloor (m+n)/p^k \rfloor$ or $\lceil (m+n)/p^k \rceil$ multiples of $p^k$ in the
product. Thus the power to which $p$ divides the numerator is at least as large as the power
to which it divides the denominator.
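
The closed form for A(m, n) is a ratio of integers, so its integrality can be checked exactly for small arguments. This is a minimal sketch; the function name `A_parts` and the range of m and n are illustrative choices, not part of the published solution.

```python
from math import factorial

def A_parts(m, n):
    """Numerator and denominator of A(m, n) = 2^(m+n-1) * prod / (m+n)! before cancellation."""
    numerator = 2 ** (m + n - 1)
    for j in range(m):
        numerator *= 4 * j + 3  # factors 3, 7, ..., 4m - 1
    for j in range(n):
        numerator *= 4 * j + 1  # factors 1, 5, ..., 4n - 3
    return numerator, factorial(m + n)

for m in range(1, 4):
    for n in range(1, 4):
        num, den = A_parts(m, n)
        print(m, n, num // den, num % den == 0)
```

Python integers are exact, so a zero remainder here is a genuine confirmation for the tested pairs rather than a floating-point approximation.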


Editorial comment. Another way to proceed after obtaining the rational formula for $A$ is to
establish that $A(m, m)$ and $A(m, m + 1)$ are integers and then invoke a recurrence relation.
Both Jean Anglesio and Calin Popescu converted the original integral for $N$ to the form

\[ N = N(m, n) = 4\int_0^{\infty} \frac{t^{4n}}{(1 + t^4)^{m+n+1}}\,dt \tag{$*$} \]

with the change of variable $t^2 = \tan\theta$. Anglesio then evaluated $N(m, 0)$ and $N(m, 1)$ by
differentiating $N_m(x) := 4\int_0^{\infty} dt/(x^4 + t^4)^{m+1}$ with respect to $x$ repeatedly, and obtained
the product for $A$ by a recurrence relation. Popescu converted the integral in ($*$) to a sum of
multiples of integrals of the form $\int_0^{\infty} (t^4 + 1)^{-k}\,dt$ and evaluated these by contour integration
and residues.

Solved also by J. Anglesio (France), G. Bach (Germany), D. Beckwith, D. Borwein & G. Sinnamon (Canada), R. J. Chapman
(U. K.), J. E. Dawson (Australia), R. A. Groeneveld, W. Janous (Austria), L. E. Mattics, C. Popescu (Belgium),
R. Richberg (Germany), V. Schindler (Germany), T. V. Trif (Romania), GCHQ Problems Group (U. K.), NSA Problems
Group, and the proposer.


The Spiral of Cornu Has No Self-Intersections 


10530 [1996, 509]. Proposed by Daniel Goffinet, Saint Etienne, France. The Cornu spiral
in the complex plane is defined by the parameterization

\[ t \mapsto z(t) = \int_0^t e^{i\pi u^2/2}\,du. \]

The eye sees no self-intersections. Is this a correct observation?


Solution by Joel Zeitlin, California State University, Northridge, CA. Note that $z(t)$ is above
the real axis for $t > 0$ and below for $t < 0$, so it suffices to consider only $t > 0$. Also,
$z(t)$ is a unit speed curve with strictly increasing curvature $\kappa(t) = \pi t$. This implies that
the osculating circles are strictly nested; see J. J. Stoker, Differential Geometry, Wiley- 
Interscience, New York, 1969, p. 31, or J. Zeitlin, Nesting behavior of osculating circles 
and the Fresnel integrals, Math. Mag. 54 (1981) 76-78. Since each point lies on its own 
osculating circle, it cannot coincide with any other point on the curve. 


Editorial comment. Some solvers showed that the distance to the limit point is strictly 
decreasing. Others showed that the integral from a to b is non-zero for any a < b. 
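
The first observation in the editorial comment can be illustrated numerically. The sketch below assumes the parameterization $z(t) = \int_0^t e^{i\pi u^2/2}\,du$ and the limit point $(1 + i)/2$; the function name `cornu_samples`, the step sizes, and the use of composite Simpson quadrature are choices made here for illustration only.

```python
import cmath
import math

def cornu_samples(t_max=5.0, steps=500, sub=20):
    """Sample (t, z(t)) on [0, t_max], integrating each step by composite Simpson (sub even)."""
    f = lambda u: cmath.exp(1j * math.pi * u * u / 2)
    h = t_max / steps
    w = h / sub
    z = 0j
    samples = [(0.0, z)]
    for i in range(steps):
        a = i * h
        total = f(a) + f(a + h)
        for j in range(1, sub):
            total += (4 if j % 2 else 2) * f(a + j * w)
        z += total * w / 3
        samples.append((a + h, z))
    return samples

samples = cornu_samples()
limit_point = 0.5 + 0.5j
distances = [abs(z - limit_point) for _, z in samples]
print(all(d1 > d2 for d1, d2 in zip(distances, distances[1:])))
```

A strictly decreasing distance to the limit point rules out self-intersections on the sampled range, since two parameter values giving the same point would give the same distance.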


Solved also by K. Andersen, N. Blachman, R. J. Chapman (U. K.), D. Constales (Belgium), D. A. Darling, J. E. Dawson 
(Australia), E. Heil (Germany), J. H. Lindsey II, M. Omarjee (France), P. Walker (Oman), GCHQ Problems Group 
(U. K.), Wilmer Alabama Mathematics Club, and the proposer. 


Choosing Random Numbers Until the Sum Meets a Threshold 


10531 [1996, 510]. Proposed by Emeric Deutsch, Polytechnic University, Brooklyn, NY,
and Ieda Rodrigues, Cleveland State University, Cleveland, OH. Let $x > 0$. Show that

\[ \sum_{q=0}^{\lfloor x \rfloor} \frac{(-1)^q (x - q)^q e^{x-q}}{q!} < 2x + 1. \]

Solution by Lise Jensen, Northeastern Illinois University, Chicago, IL. Let $f(x)$ denote
the function on the left side of the inequality. Then (i) $f(x) = e^x$ if $0 < x < 1$ and
$f(x) = e^x - (x - 1)e^{x-1}$ if $1 \le x < 2$, (ii) $f(x)$ is continuous for all $x$, and (iii) for all
$x \ge 1$,

\[ \int_{x-1}^{x} f(t)\,dt = f(x) - 1. \tag{1} \]

Statement (i) follows directly from the expression for $f$ and (ii) follows from observing that
if $x$ is not an integer, then the summand is continuous and if $x$ is an integer and $q = x$, then the
summand is 0. Statement (iii) can be seen formally (and proved rigorously) by observing
that the Laplace transform of the function given by (i) and (iii) is $1/(s - 1 + e^{-s}) =
\sum_{q \ge 0} (-1)^q (s - 1)^{-(q+1)} e^{-qs}$ and that $(s - 1)^{-(q+1)} e^{-qs}$ is the Laplace transform of
$\theta_q(x)(x - q)^q e^{x-q}/q!$ where $\theta_q(x) = 1$ if $x \ge q$ and $= 0$ otherwise. It can also be proved
directly by integrating by parts, shifting indices, and using induction.
Consider the linear functional equation

\[ F(x) = 1 + \int_{x-1}^{x} F(t)\,dt. \tag{2} \]

For every $g$ continuous in (0, 1], (2) has a unique solution $F_g$ such that $F_g = g$ in the interval
(0, 1]. To prove this, note that if $F$ is known for $0 < x \le N$, then for $N < x \le N + 1$ we
have $F(x) = \phi(x) + \int_N^x F$, where $\phi$ is a known function; this can be solved for $F$ in terms
of $\phi$. Also, note that $F_g$ is positive if $g$ is positive and that $F_g > F_h$ if $g(x) > h(x) > 0$
for all $x$ in (0, 1]. The function $1 + 2x$ satisfies Equation (2), so $F_{1+2x}(x) = 1 + 2x$ and
$F_{e^x}(x) = f(x)$. Now $e^x < 1 + 2x$ for $0 < x \le 1$, since $e^x - 1 - 2x$ is 0 at 0, is negative
at 1, and has its only minimum at $\log 2$. It follows that $f(x) < 1 + 2x$ for all $x > 0$.


Editorial comment. The proposers explain the problem in the following way. Suppose one
keeps selecting (independently) random numbers uniformly distributed in the interval [0, 1]
until the total exceeds $x$. The number of selections is a random variable $Z_x$ having possible
values 1, 2, 3, .... The expected value of $Z_x$ turns out to be the sum in the problem.
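
The sum in the problem is straightforward to evaluate directly. The following sketch assumes the summand $(-1)^q (x - q)^q e^{x-q}/q!$ with $q$ running from 0 to $\lfloor x \rfloor$; the function name `f` follows the solution, and the grid on (0, 10] is an arbitrary illustrative choice.

```python
import math

def f(x):
    """Sum over q = 0..floor(x) of (-1)^q (x - q)^q e^(x - q) / q!."""
    return sum(
        (-1) ** q * (x - q) ** q * math.exp(x - q) / math.factorial(q)
        for q in range(math.floor(x) + 1)
    )

grid = [k / 100 for k in range(1, 1001)]
print(all(f(x) < 2 * x + 1 for x in grid))
print(f(0.5), math.exp(0.5))
```

On (0, 1) the sum reduces to $e^x$, matching statement (i) of the solution, and the inequality holds at every grid point tested.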


Solved also by D. A. Darling, L. E. Mattics, GCHQ Problems Group (U. K.), and the proposers.


The Iterated Sine Sequence


10535 [1996, 510]. Proposed by Vladimir Janković and Jovan Vukmirović, Belgrade, Yugoslavia.
Given $s_0$ with $0 < s_0 < \pi/2$, use $s_{n+1} = \sin s_n$ to define the sequence $(s_n)$. Show
that $n^2 s_n^2 - 3n + (9/5)\ln n$ is convergent.


Solution by Robin J. Chapman, University of Exeter, Exeter, UK. Since $0 < \sin x < x$ if
$0 < x < \pi/2$, it follows that $(s_n)$ is a decreasing and bounded sequence. Its limit $s$ satisfies
$\sin s = s$, and so $s = 0$. From the Maclaurin series for the sine function,

\[ s_{n+1} = s_n\left(1 - \frac{s_n^2}{6} + \frac{s_n^4}{120} + O(s_n^6)\right). \]

Setting $u_n = 1/s_n^2$ gives

\[ u_{n+1} = u_n\left(1 + \frac{1}{3u_n} + \frac{1}{15u_n^2} + O(u_n^{-3})\right). \tag{$*$} \]

Since $u_n \to \infty$ as $n \to \infty$ it follows from ($*$) that for large enough $n$ we have $u_{n+1} - u_n >
1/4$, and so $u_n > n/4 - A$ for some constant $A$. Hence $u_{n+1} - u_n = 1/3 + O(n^{-1})$.
Summing this gives $u_n = n/3 + O(\log n)$, and so $1/u_n = 3/n + O((\log n)/n^2)$. Inserting
this in ($*$) gives

\[ u_{n+1} - u_n = \frac{1}{3} + \frac{1}{5n} + O\left(\frac{\log n}{n^2}\right). \]

Since $\sum_{n \ge 1} n^{-2}\log n$ is convergent and $\sum_{j=1}^{n-1} j^{-1} = \log n + \gamma + o(1)$ (where $\gamma$ is Euler’s
constant), we have

\[ u_n = \frac{n}{3} + \frac{\log n}{5} + K + o(1) \]

for some constant $K$. Hence

\[ n^2 s_n^2 = \frac{n^2}{u_n} = 3n - \frac{9\log n}{5} - 9K + o(1), \]

and so $n^2 s_n^2 - 3n + \frac{9}{5}\log n \to -9K$ as $n \to \infty$.
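
The convergence can be watched numerically by iterating the sine map. This is a rough illustration, assuming the expression $n^2 s_n^2 - 3n + (9/5)\ln n$ from the problem; the function name `limit_candidate`, the starting value $s_0 = 1$, and the sample sizes are arbitrary choices.

```python
import math

def limit_candidate(s0, n):
    """Return n^2 s_n^2 - 3n + (9/5) ln n for the sequence s_{k+1} = sin(s_k)."""
    s = s0
    for _ in range(n):
        s = math.sin(s)
    return n * n * s * s - 3 * n + 1.8 * math.log(n)

for n in (1000, 10000, 100000):
    print(n, limit_candidate(1.0, n))
```

The printed values settle down as n grows, consistent with a finite limit $-9K$ that depends on $s_0$.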


Editorial comment. Several solvers noted that the iterated sine sequence of this problem is 
studied in N. G. de Bruijn, Asymptotic Methods in Analysis, Dover, 1981. On p. 159, de 
Bruijn proves the stronger statement 


ou =f (1-2 oee Cy wloeint Blogn ty | 4 (losin) 
n 10 n 2n n2 a 


where $C$ depends on $s_0$ and where $\alpha$, $\beta$, and $\gamma$ are explicit polynomials in $C$.


Solved also by U. Abel (Germany), S. Amghibech (France), J. Anglesio (France), D. Constales (Belgium), D. A. Darling, 
D. Doster, L. Jensen, K.-W. Lau (Hong Kong), J. H. Lindsey II, L. E. Mattics, M. Omarjee (France), A. Stenger, P. Walker, 
GCHQ Problems Group (U. K.), and the proposers. 


A Pair of Exponential Equations 


10537 [1996, 598]. Proposed by Jonathan Aronson, Carnegie Mellon University, Pittsburgh,
PA. (a) For which positive real numbers $c$ do there exist real numbers $A$ and $B$ with
$B = ce^{-A}$, $A = ce^{-B}$, and $A \ne B$? (b) Show that $AB < 1$ when such $A$ and $B$ exist.


Composite solution I by Gilbert N. Lewis, Michigan Technological University, Houghton,
MI, and Erik Doeff, Montana State University, Bozeman, MT. We must have $A, B > 0$, and
we may assume $A > B > 0$. A solution to $B = ce^{-A}$ and $A = ce^{-B}$ is equivalent to a
solution to $Ae^{-A} = Be^{-B}$ and $c = Be^{A} = Ae^{B}$. The function $f(x) = xe^{-x}$ is increasing
on (0, 1) and decreasing on $(1, \infty)$. Hence $A > 1 > B$, and $B$ is implicitly defined as
a function of $A$. Integrating $(x - 1)^2/x^2 = 1 - 2/x + 1/x^2 > 0$ on $(1, \infty)$ shows that
$x - 2\ln x - 1/x > 0$ on $(1, \infty)$. Exponentiating and rearranging gives $e^{-1/x}/x > xe^{-x}$.
Hence $f(1/A) > f(A) = f(B)$. Since $f$ is increasing on (0, 1), this implies $1/A > B$ or
$AB < 1$.
Now $\ln A - A = \ln B - B$, so differentiating implicitly gives

\[ \frac{dB}{dA} = \frac{(1 - A)B}{A(1 - B)} \]

and

\[ \frac{dc}{dA} = \left(B + \frac{dB}{dA}\right)e^{A} = \frac{1 - AB}{A(1 - B)}\,Be^{A} > 0. \]

Hence $c$ is an increasing function of $A$ for $A \in (1, \infty)$. As $A \to 1^{+}$, $B \to 1^{-}$ and hence
$c \to e$. As $A \to \infty$, $c = Ae^{B} > A$ also tends to $\infty$. Thus $c$ attains exactly the values in
the range $(e, \infty)$.


Solution II by Harry Sedinger, St. Bonaventure University, St. Bonaventure, NY. Let $g(x) =
ce^{-x}$. The existence of $A$ and $B$ is equivalent to having two fixed points of $f = g \circ g$
other than the unique fixed point $P$ of $g$. Other than $P$, the fixed points of $g \circ g$ come in
pairs $A$ and $g(A)$. Note that $A$ and $g(A)$ occur on opposite sides of $P$, hence $A \ne g(A)$.
Now $f'(x) = c^2 e^{-(x + g(x))} > 0$ and $f''(x) = c^2 e^{-(x + g(x))}(-1 + ce^{-x})$. Therefore $f$ is
increasing, concave upward for $x < \ln c$, and concave downward for $x > \ln c$. Also note
that $f(0) > 0$ and $f(x) \to c$ as $x \to \infty$. There are two cases: (i) $P$ is the unique fixed
point of $f$ and $f'(P) \le 1$, and (ii) there are three fixed points of $f$, $A < P < B$, $f'(A) < 1$,
$f'(P) > 1$, and $f'(B) < 1$.

Note that $f'(P) = c^2 e^{-2P} = P^2$, so (i) occurs when $P \le 1$, or equivalently when
$c \le e$, and (ii) occurs when $P > 1$, or equivalently when $c > e$. This solves (a): a solution
exists if and only if $c > e$. To see (b), note that in case (ii), $1 > f'(A) = c^2 e^{-(A + g(A))} =
ce^{-A}ce^{-B} = BA$.


Editorial comment. Frank J. Flanigan noted that part (a) is equivalent to problem 1313 in
Math. Mag. [1989, 58; 1990, 59], which asks for pairs of points $(x, y)$ on the exponential
$y = a^x$ that are symmetric about the line $y = x$. If $A$, $B$, and $c$ satisfy the equations of
the problem, then the points $(A/c, B/c)$ and $(B/c, A/c)$ lie on $y = (e^{-c})^x$, and conversely.
Can A. Minh noted that part (b) appeared as problem 27.8 in Math. Spectrum 28 No. 1
(1995/6). Several solvers noted that one can avoid implicit differentiation in solution I by
setting $r = B/A$ and solving for $A$, $B$, and $c$ as functions of $r$.
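
The parameterization by $r = B/A$ mentioned above can be made explicit. Substituting $B = rA$ into $\ln A - A = \ln B - B$ gives $A = \ln(1/r)/(1 - r)$, and then $c = Ae^{B}$. The sketch below is an illustration of that substitution only; the function name `solution_from_ratio` is hypothetical and the sample ratios are arbitrary.

```python
import math

def solution_from_ratio(r):
    """Given r = B/A in (0, 1), return (A, B, c) with B = c e^(-A) and A = c e^(-B)."""
    A = math.log(1 / r) / (1 - r)
    B = r * A
    c = A * math.exp(B)
    return A, B, c

for r in (0.1, 0.5, 0.9):
    A, B, c = solution_from_ratio(r)
    print(r, round(A, 4), round(B, 4), round(c, 4), A * B < 1, c > math.e)
```

Each sampled ratio yields $A > 1 > B$, $AB < 1$, and $c > e$, matching parts (a) and (b).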

Solved also by P. Alsholm (Denmark), S. Amghibech (France), K. F. Andersen (Canada), J. Anglesio (France), R. Barbara
(Lebanon), M. Brozinsky, P. Budney, D. Callan, R. J. Chapman (U. K.), G. G. Chappell, D. Constales (Belgium),
D. A. Darling, J. E. Dawson (Australia), F. J. Flanigan, K. Ford, J. Gutiérrez (Spain), J. L. Hartman, G. Isaacs,
R. A. Kopas, O. Kraft & M. Schaefer (Germany), G. Lafferriere, J. H. Lindsey II, R. Manning, A. Nijenhuis, K. Schilling,
H.-J. Seiffert (Germany), M. Shemesh (Israel), W. R. Smythe, R. Weinstock, L. Widmer, Anchorage Math Solutions
Group, Con Amore Problems Group (Denmark), GCHQ Problems Group (U. K.), NCCU Problems Group, NSA Problems
Group, Wilmer Alabama Mathematics Club, and the proposer.


REVIVALS 


A Special Sequence of Algebraic Integers 


E 3461 [1991, 755; 1997, 171]. Proposed by David Callan, University of Wisconsin,
Madison, WI. Suppose $r$ is a rational number but not an integer. It is known that $\tan(r\pi/2)$ is
an algebraic number; see Ivan Niven, Irrational Numbers, Carus Mathematical Monographs
No. 11, pp. 37-41. Find the smallest positive integer $k_r$ such that $k_r \tan(r\pi/2)$ is an algebraic
integer.


Editorial comment. Unfortunately, there were two minor errors in the published solution.
First, the opening sentence of the solution was inaccurately worded. It should have read:
“If the denominator of $r/2$ in lowest terms is twice a power of an odd prime $p$, then $k_r = p$;
otherwise $k_r = 1$.” In other words, for $k_r \ne 1$, one needs not only that the denominator of
$r$ is a power of an odd prime but also that the numerator of $r$ is odd.
Second, it was asserted that, for $n > 2$, irreducibility of the cyclotomic polynomial $\Phi_n(z)$
over $\mathbb{Q}$ implies irreducibility over $\mathbb{Q}$ of the polynomial

\[ P_n(t) = (1 - it)^{\phi(n)}\,\Phi_n\!\left(\frac{1 + it}{1 - it}\right). \]

The example $\Phi_4(z) = z^2 + 1$, $P_4(t) = 2 - 2t^2$ shows that this argument is specious.
However, irreducibility of $P_n(t)$ is really needed only when $n$ is twice a power of an odd
prime. If $p$ is prime and $q$ is a positive integer, then $\Phi_{2p^q}(z) = (1 + z^{p^q})/(1 + z^{p^{q-1}})$, so

\[ P_{2p^q}(t) = \frac{(1 + it)^{p^q} + (1 - it)^{p^q}}{(1 + it)^{p^{q-1}} + (1 - it)^{p^{q-1}}}
 = \frac{2(1 + pA(t))}{2(1 + pB(t))} = 1 + pC(t), \]

where $A(t)$, $B(t)$, and $C(t)$ are polynomials with integer coefficients, and the leading
coefficient of $C(t)$ is $(-1)^{(p-1)/2}$. Irreducibility of $P_{2p^q}(t)$ then follows from the Eisenstein
Irreducibility Theorem; see Harry Pollard and Harold G. Diamond, The Theory of Algebraic
Numbers, Carus Mathematical Monographs, No. 9, pp. 30-35.


REVIEWS


Edited by Underwood Dudley 
Mathematics Department, De Pauw University, Greencastle, IN 46135 


Conceptual Mathematics: A First Introduction to Categories. By F. William Lawvere &
Steven Schanuel with the cooperation of Emilio Faro, Fatima Fenaroli, Danilo
Lawvere and the students of Mathematics 108 of the University of Buffalo.
Cambridge University Press, 1997, 358, $80.00 (hard bound), $25.95 (paper bound).


Reviewed by Saunders Mac Lane 


The amazing science of mathematics involves a mixture of calculations and 
concepts. When we teach, it is often easier to expound the calculations than it is to
explain the underlying conceptual structures. But they are there—and now formu- 
lated for freshmen in college in this attractive new text, emphasizing the ideas 
about categories. 

Category theory provides a simple and systematic conceptual framework for 
mathematics including the ideas of sets, functions between sets and of “sets with
structure”, and the axiomatic formulations of the composition of functions. This
text, written by two experts in category theory and tried out carefully in courses at 
SUNY of Buffalo, provides a simple and effective first course on conceptual 
mathematics. It offers a careful elementary description of sets and of categories, 
with many examples—and leading up to many fascinating matters, including the 
Brouwer fixed point theorem, Cantor’s diagonal argument, and the Gödel numbers.

The text starts with the idea of multiplication (for example, Space = Plane x 
Line) and shows how this idea leads to Cartesian coordinates and to the descrip- 
tion of a product of two objects by diagrams using the projections of the product of 
two spaces on its factors. 

The first example of a category is the category of finite sets and their maps. This 
is illustrated by both internal diagrams (which element goes where) and by external 
diagrams (arrows from the domain of a map to the codomain). This leads to the 
idea of the composition of maps and the associative law for such composition. 
Here, as throughout the book, the formal definition is followed by many carefully 
described examples and computations. Also included are typical comments from 
students and the suggested clarification of their occasional puzzlement. In this 
way, the book presents an effective dialog with ideas. 

From the category of finite sets, the text goes on to the general definition of a 
category, with its objects, maps between these objects, the composition of these 
maps, the associative law for composition of maps and the identity maps. Various 
examples illustrate the importance of the order in which two maps may be 
composed. In general each section of the book is followed by worked-out exercises, 
examples of student response, and more exercises. 

Special kinds of maps (isomorphisms, the sections and the retracts of a given
map, idempotents, injections, etc.) are illustrated, defined, and discussed. An
informal discussion of continuity indicates that it is plausible that the inclusion
map j: C → D of the boundary circle C of a disc D is not a retract. Using this
plausible result, the authors give the standard proof that any continuous map of 
the disc into itself necessarily has a fixed point (the Brouwer fixed point theorem). 
Here and elsewhere, the text is careful in the construction and exposition of 
proofs, so that it provides for the student an effective introduction to the basic idea 
of ‘proof’. 

Part III provides a number of examples of easily accessible categories: the 
category of sets, the category of sets with an endomorphism (also called the 
category of automata or of discrete dynamical systems), the category of irreflexive 
graphs (where each edge is directed from one vertex (dot) to another), the category 
of reflexive graphs (with an “identity arrow” at each vertex), the category of sets 
each with an idempotent endomap and the category of arrows of any given 
category. With these examples, the student is given a view of the idea of a “set 
with structure” and of structure preserving maps. Again, there are many exercises 
and comments. 

Finite state automata provide a next example of a category. This study of 
automata leads to examples of cycles of various finite lengths and to the introduc- 
tion of the natural numbers as the automaton N consisting of the string of natural 
numbers with the successor map. This example of a “structure” suggests the idea of 
a map that preserves a given structure. In turn, this leads to the definition of a 
functor as a map of categories that preserves all the category structure. Then 
comes the general idea of studying large “objective” categories by functors to them 
from small “test” categories, with an outlook on the notion of symmetry. 

Universals, such as product objects and terminal objects, next appear. A 
“terminal” object 1 in any category is one such that any object X has a unique or 
“universal” map X — 1 to this terminal. A product, X x Y of two objects X and Y 
with its projections is a diagram 


X ← X × Y → Y


such that any other pair of maps X ← A → Y to X and Y can be composed from
the two projections and a unique map A → X × Y. By reversing all the arrows in
this definition one has the definition of the “dual” operation of “sum” (or
“coproduct”). In a category with both products and sums these definitions
determine a standard map


A × B + A × C → A × (B + C).


The category is said to satisfy the “distributive law” when this standard map is an
isomorphism (examples). There is a careful proof of the uniqueness (up to 
isomorphism) of the product and the sum when they exist. 

The fifth chapter introduces the “map objects” Y^T. For example, given sets Y
and T, the set Y^T of all the maps f: T → Y is such a map object. It comes
together with the corresponding “evaluation” map


Y^T × T → Y


and the “universal” property of this evaluation. More generally, in any category 
with products and a terminal object, an object X is said to parametrize all the 
maps T → Y when there is some one map f: T × X → Y such that any map
T → Y has the form f(−, x) for some “point” x: 1 → X of X. Then Cantor’s
diagonal theorem holds in any such category: if an object T has enough points to
parametrize all maps T → Y by means of some map T × T → Y, then Y has the
fixed point property. In particular, the set 2 of two elements does not have the
fixed point property. Therefore, for all sets T, one has T < 2^T, since 2^T does
parametrize all maps T → 2.

After further careful discussion and examples of exponentials, the text turns to 
the consideration of “parts” (that is, subobjects S) of an object X, and of the 
characteristic functions h: X — 2 of such subobjects. This characteristic function 
has the property that the diagram 


      S → 1
      ↓   ↓
   h: X → 2 = {0, 1}


is a “pullback’’. But this usual set 2 consisting of the two truth values 0 and 1 must 
in many categories be replaced by a more sophisticated truth value object. This 
leads to the definition of a topos: a category with a terminal 1 and initial 0, with 
products, pullbacks, sums and all exponentials, together with a suitable object t:
1 → Ω of truth values, such that any part of an object X has a unique characteristic
function X → Ω, much as in the diagram above.

All told, this text on conceptual mathematics thus succeeds in presenting a 
careful introduction of concepts leading up to non-trivial examples. This reviewer 
has not yet had occasion to try this text out with students, but he is confident that 
the care and the wealth of examples will be successful in explaining to students the 
idea of a proof and the concepts such as “sets with structure” and “structure-pre- 
serving” maps. Students are guided to the art of saying things “right”. Thus on page 
279, Chad is asked 


“To say that T is a terminal object in the category C means what?”
Chad responds “That there is only one map”.

“One map; from where to where?”

Chad: “From the other object to T”.

“What other object?”

Chad: “Any other object”.

“Right. From any other object. So start the sentence with that;
don’t leave it for the end”.


In this way Chad and the other students are carefully led to say the things 
necessary for clarity. The careful use of concepts and their examples through this 
text leads to precision and thus to understanding. 


Early Astronomy. By Hugh Thurston. Springer-Verlag, 1994; first softcover printing,
1996, x + 268, $29.95.


Reviewed by Ezra Brown 


Books have always been willing accomplices in my casual but life-long interest in 
astronomy and outer space. My early interest in stargazing had been aided and 
abetted by two books. One was The Stars: A New Way to See Them, written by 
H. A. Rey [8] (yes, the author of the “Curious George” books!), which did indeed
redraw many of the constellations so as to make them more recognizable to 
budding young astronomers. The other was Henry M. Neely’s fascinating The Stars 
by Clock and Fist [5], which described a method—which I still use and which works 
beautifully—for locating stars and constellations in which the only thing you 
needed, besides the book, was an ability to find the North Star. Its author dreamed 
up the clock-and-fist system while teaching adult education classes in astronomy at 
the Hayden Planetarium back in the Truman administration. 

Since solving mysteries for the uninitiated—in particular, for children—was, in 
essence, what these books were all about, they had to be well written and clear, or 
nobody would read them. And they were. Because of them, I could pick up other 
books on “amateur” astronomy (i.e., any book whose Library of Congress classifica- 
tion number begins with QB63) and make reasonable sense of them by translating 
right ascensions and declinations, as well as a variety of star charts and maps, into 
the languages and pictures of Rey and Neely. Furthermore, if a really well-written 
book on astronomy fell into my hands, it was eagerly read. 

And so it was that when Gerald Hawkins’ Stonehenge Decoded [2] came out 
many years later, I devoured it. Here was a tantalizing mystery involving astronomy 
as practiced by people who really wanted to understand the sky; so great was their 
fascination that they created a structure that was part scientific observatory and 
part sacred space. The mystery was just this: why was Stonehenge built in that 
manner and in that location, and how was it done? Well, this story is now well 
known, and a number of Hawkins’ deductions about Stonehenge have been 
subsequently modified. The point is that Stonehenge Decoded was, and remains, a 
well-written book—for adults, even—about a subject of great interest to many lay 
persons, including myself. 

Now, let’s jump ahead a decade or so. During the mid-80’s, I had acquired a 
semiprofessional interest in Early Modern Science. After teaching the history of 
mathematics a few times, I was not happy with the various discussions about the 
mathematics of the Renaissance. This included, of course, applications of mathe- 
matics—in particular, an old friend from the Greek quadrivium, namely astron- 
omy. I began trying to understand how Copernicus might have arrived at the 
theory that became his De revolutionibus orbium coelestium. Subsequent investiga- 
tions into Copernicus’ predecessors led to Ernst Zinner’s biography (in German) of 
that energetic mathematician and astronomer, Johann Müller of Königsberg, 
called Regiomontanus (1436-1476) [10]. Following a conversation with my friend, 
historian of science David Lux (“Why not translate it yourself? At a page a day, 
you'll be done in a year!”), I did just that. (It was even published—but that’s 
another story.) 

This is all by way of explanation that, when Hugh Thurston’s Early Astronomy 
arrived in the mail, I could hardly wait to get my hands on it. The first thought that 
sprang to mind was, “Where was this book when I needed it?” For it would have 
been—I anticipated—of great help in filling in the multitude of gaps in my 
knowledge of the details of early astronomy. Such a book, I thought, would 
describe the instruments used in the pre-telescopic era, explain about oppositions, 
conjunctions, proper motions, synodic periods, and a host of other astronomical 
terms, identify the principal problems ancient astronomers were trying to solve, 
and give some indication of why they were important. 

The fairly detailed table of contents looked promising. The book begins with a 
preliminary chapter of forty-four pages on early stargazers including such standard 
topics as the rotation of the heavens, the sun, moon, planets and stars, and the 
tools early astronomers used. The next nine chapters cover the astronomy of 
several early civilizations, including the megalithic world, Mesopotamia, Egypt, 
China, Greece, India, the Arabic-speaking Islamic world, the Mayas, and the 
European Renaissance. About one-fifth of the book is devoted to the achieve- 
ments of Hipparchus and Ptolemy, and this discussion is supplemented by six 
appendices. Much would be made clear to me about a subject that my sainted 
Aunt Mildred would have called a hegdesh. 

What’s a hegdesh? Well may you ask. Here’s an analogy. You go into an antique 
shop where you know there are wondrous things to be had, but in order to find 
them, you have to pick your way through large piles of things stacked here and 
there. The aisles are narrow, and you know that there’s some hint of an orderly 
plan, but it’s hard to figure out just what that plan is. Furthermore, occasional 
signposts are wrong and others are quite mysterious. Also, in your searching, you 
have to pay careful attention to details. After a great deal of peeping in here, 
poking in there, dusting off this, thinking about that, you finally come to some 
gems. Aunt Mildred would have called this antique shop a hegdesh. 

Early astronomy is a bit like that. Thurston himself sets the tone of it all on 
page 4, where he is talking about the constellations: “The shapes of the constella- 
tions are fixed. They do not change like the pattern made by a flock of starlings 
against the sky. But their positions are not fixed: the complete bowl rotates about a 
fixed point once a day. This point is called the celestial pole.” Then he trumps his 
own ace on page 7, pointing out that “(1) the center of rotation is not fixed; (2) the 
pattern is not fixed; and (3) one rotation does not take exactly one day.” 

In a nutshell, this was the essential problem facing early astronomers. Celestial 
objects appear to move in nice, simple, orderly patterns. But they do not. The total 
motion of a celestial object is mainly regular, but with small variations, not obvious 
to the casual eye, that become more discernible with long-term observations: 
general regularities tempered with small irregularities. Knowing this prompts many 
questions. What tools did the early astronomers use to measure time, directions, 
and motion? At what point did they begin to use mathematics to make some 
attempt at predicting future positions of celestial objects? In short, How Did They 
Figure It All Out? 

Alas, this book is not an easy read—in some places, it is very slow going—and 
so it became clear that this would not help make early astronomy less of a hegdesh. 
One difficulty is that Thurston’s treatment of his subject matter is very under- 
stated. Another is that his writing style is more descriptive than compelling. 
Curious about the author’s motivation and purpose for writing the book, I turned 
to the introduction, which states that “This book covers astronomy from the 
beginning up to the time of Kepler. Most of this was developed at a time when 
other sciences, notably physics and chemistry, had scarcely started. The scientific 
edifice built up over this period is one of the triumphs of human intellect.” There 
is no statement of purpose or “why this book came to be,” other than to cover the 
material. And nothing turns students off more than professors who just “cover the 
material.” 

Take the topic of eclipses, for example. There is all of the information necessary 
to understand the motions of the sun and the moon and to understand why 
eclipses seem to occur at approximately regular intervals, but it is not written in a 
way designed to excite or entice the reader. Another example is the treatment of 
Stonehenge. Thurston ends his discussion on Stonehenge by pointing out that the 
alignments there were set up using only very painstaking and reasonably accurate 
observations—as opposed to advanced mathematical or astronomical theories— 
and states that the marvel of the place is not in its astronomy but in its 
construction. But Hawkins’ book on the same subject conveys a great deal of 
wonder both at the construction and at the observations. The wonder also lay in 
figuring out that yes, the structure’s alignments are astronomical in character, and 
this comes through in Hawkins’ writing very clearly. For Thurston, the wonder 
apparently disappears once you see “how the trick is done.” 

In addition, there are some occurrences in the book that are somewhat 
distracting. One that turns out to be amusing is in a passage on the orientation of 
the pyramids: “To find north or south accurately we must use the stars; the sun is 
too big and too bright to yield an accurate result. I describe one way in which 
north can be found on page 26.” But this passage is on page 26! (After a number 
of false starts and failed searches, I found the reference close to the end of page 
26.) Not so amusing is a passage on Hipparchus and the length of the year. In one 
sentence, it is written that “the time interval between the summer solstices of 1990 
and 1991 was 365.2403 days, whereas between the summer solstices of 1991 and 
1992 it was 365.2465 days.” A couple of sentences later, we read that “the year 
A.D. 1 was 365.242187 days long and the year decreases by 0.000006 days per 
century.” These two sentences seem to contradict one another. A third example is 
this: although there is a great deal of discussion about large-scale chronology—the 
different kinds of years and months—there are almost no statements about how 
these early astronomers measured the passage of time during a day. It is men- 
tioned that the Babylonians timed the culmination (highest point) of one star by 
recording which other star is rising, but that is about it for “telling time.” Finally, 
I was disappointed that Regiomontanus does not appear in the book (I'll get over 
it). 
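
The apparent contradiction in the year-length figures can be put in numbers. The following Python sketch takes the quoted linear formula at face value (365.242187 days in A.D. 1, shrinking by 0.000006 days per century) and compares it with the two quoted solstice-to-solstice intervals. The function names, and the choice to count centuries from A.D. 1, are assumptions made for illustration; they are not taken from Thurston's book.

```python
# Figures quoted in the review; everything else here is an illustrative assumption.
YEAR_LENGTH_AD1 = 365.242187      # days, quoted length of the year A.D. 1
DECREASE_PER_CENTURY = 0.000006   # days per century, quoted rate of decrease


def formula_year_length(year):
    """Year length in days from the quoted linear formula.

    Assumes centuries are counted from A.D. 1 (an assumption, not from the source).
    """
    centuries_elapsed = (year - 1) / 100
    return YEAR_LENGTH_AD1 - DECREASE_PER_CENTURY * centuries_elapsed


def deviation_minutes(interval_days, year):
    """Difference, in minutes, between a quoted interval and the formula value."""
    return (interval_days - formula_year_length(year)) * 24 * 60


if __name__ == "__main__":
    # Quoted intervals between successive summer solstices.
    quoted = {1990: 365.2403, 1991: 365.2465}
    for year, interval in quoted.items():
        print(year, round(formula_year_length(year), 6),
              round(deviation_minutes(interval, year), 1))
```

Under these assumptions the formula gives about 365.24207 days for 1990, so the two quoted intervals sit roughly 2.5 minutes below and 6.4 minutes above it. One possible reading is that the formula describes a long-term average while the solstice intervals are individual years, but the quoted passages do not say so.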

But let us return to the writing style. Since there is a great deal of terminology 
to absorb, it might be the case that something intrinsic in the subject of ancient 
astronomy lends itself to tediosity (if there is no such word, there should be). So, I 
looked through several other books on the subject. 

I began with Neugebauer’s The Exact Sciences in Antiquity [6] and van der 
Waerden’s Science Awakening II: The Birth of Astronomy [9], just to make sure. 
They include as much detail as does Thurston (maybe more!), but they have been 
written in such a way as to make the subject matter compelling. They set the hook 
and lure the reader on, giving manageable bites of technical information on which 
to chew. 

I also read through parts of Neugebauer’s monumental three-volume treatise, A 
History of Ancient Mathematical Astronomy [7], which is both unapologetically 
mathematical—not for the faint of heart or the “trigonometrically challenged,” 
and fairly slow going even for an interested mathematician—and an absorbing 
read. This treatment includes lots of illustrative examples worked out in detail, so 
that the exposition is straightforward to follow. 

Yet another path to early astronomy is the field of archaeoastronomy—the 
interdisciplinary study of ancient, prehistoric, and traditional astronomy and its 
cultural context. If you are interested in this point of view, then E. C. Krupp’s 
Echoes of the Ancient Skies [4] is a treat. This book is chock-full of information 
about ancient and prehistoric observatories, myths, calendars, ceremonies and 
cosmologies from around the world, but it also contains enough of the essentials 
on how the heavens move to interest any reader who is a hard-core astronomy 
buff. Krupp is clearly caught up in the wonder and the beauty of it all, and his 
writing shows it. Other books along the same line are two collections, Astronomy of 
the Ancients, edited by K. Brecher and M. Feirtag [1], and In Search of Ancient 
Astronomies [3], edited by Krupp. 

What it looks like is that Thurston’s book is “neither fish nor fowl nor good 
red herring”: it’s not strictly (1) mathematical, (2) historical, or (3) mystical/religious/
interdisciplinary. Also, its writing style is not very compelling. But it 
contains the information—it covers the subject. Also, the gems are there for the 
polishing, but you might have to work at it. (One of the gems is almost at the end 
of the book: Thurston’s description of Kepler’s efforts at finding the orbit of Mars. 
Would that the entire book were that intriguing!) 

Oh, yes. You may have been wondering, “Just what is Neely’s clock-and-fist 
method?” Easily explained. Instead of using compass directions to tell you which 
way to face in order to see a particular constellation, you use the clock directions 
with 12 o’clock being north (toward the North Star). And instead of using a sextant 
to measure angles up from the horizon, you use your fist to get a line-of-sight to 
the horizon, and then sight upward however many fists you need to get to the 
particular constellation. Here, the convention is that from horizon to zenith equals 
nine fists. And now, go and explore the night sky yourself. Have fun...there are 
great wonders out there to behold! 
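
The arithmetic behind the clock-and-fist method is simple enough to sketch in a few lines of Python. The nine-fists convention and the rule that 12 o'clock points north come from the description above. The function names, and the assumption that clock hours run clockwise through east (so that 3 o'clock is due east), are illustrative choices rather than details taken from Neely's book.

```python
# Horizon to zenith is 90 degrees and equals nine fists (convention from the text).
DEGREES_PER_FIST = 90 / 9
# A full turn of the clock face is 360 degrees split into 12 hours.
DEGREES_PER_CLOCK_HOUR = 360 / 12


def clock_to_azimuth(hour):
    """Convert a clock direction to degrees from north.

    Assumes 12 o'clock is north and hours run clockwise through east
    (the clockwise sense is an assumption, not stated in the review).
    """
    return (hour % 12) * DEGREES_PER_CLOCK_HOUR


def fists_to_altitude(fists):
    """Convert a number of fists above the horizon to degrees of altitude."""
    if not 0 <= fists <= 9:
        raise ValueError("fists must lie between 0 (horizon) and 9 (zenith)")
    return fists * DEGREES_PER_FIST


if __name__ == "__main__":
    # A hypothetical target at 2 o'clock, four fists up.
    print(clock_to_azimuth(2), fists_to_altitude(4))
```

With these conventions one fist spans 10 degrees and one clock hour spans 30 degrees, which is coarse but well matched to finding constellations by eye.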


1–4: Semester 


General, T*(14: 1), S*, L*. Laboratories 
in Mathematical Experimentation: A Bridge to 
Higher Mathematics. Mount Holyoke College. 
Springer-Verlag, 1997, xix + 278 pp, $34.95 (P). 
[ISBN 0-387-94922-4] Sixteen, basically in- 
dependent, labs that help students understand 
the experimental nature of mathematics before 
they delve into theoretical courses. Access to 
computers is required; knowledge of calculus is 
not. An excellent resource. CEC 


General, P**, L*. Indiscrete Thoughts. Gian- 
Carlo Rota. Ed: Fabrizio Palombi. Birkhauser 
Boston, 1997, xxii + 280 pp, $36.50. [ISBN 0- 
8176-3866-0] In wide-ranging essays, mem- 
oirs, book reviews, and other assorted genres, 
Rota surveys nearly 50 years of life in aca- 
demic mathematics and philosophy. The menu 
includes social commentary, mathematical gos- 
sip, academic philosophy, advice to the young 
(“Ten Lessons I Wish I had Learned”), and 
much more. Learned, thought-provoking, po- 
litically incorrect, delighting in paradox, and 
likely to offend—but everywhere readable and 
entertaining. PZ 


General, P. The Legacy of Norbert Wiener: 
A Centennial Symposium. Eds: David Jerison, 
I.M. Singer, Daniel W. Stroock. Proc. of Symp. 
in Pure Math., V. 60. AMS, 1997, xix + 405 pp, 
$80. [ISBN 0-8218-0415-4] Proceedings of a 
1994 event at M.I.T. 


Reference, P, L. Handbook of Mathematics. 
I.N. Bronshtein, K.A. Semendyayev. Transl: 
K.A. Hirsch. Springer-Verlag, 1997, xv + 
973 pp, $57. [ISBN 3-540-62130-X] Reprint 
of the revised third edition (TR, October 1986). 


Recreational Mathematics, P, L. The Green 
Book of Mathematical Problems. Kenneth 
Hardy, Kenneth S. Williams. Dover, 1997, ix 
+ 173 pp, $6.95 (P). [ISBN 0-486-69573-5] 
Republication, with corrections, of The Green 
Book: 100 Practice Problems for Undergrad- 
uate Mathematics Competitions published by 
Integer Press in 1985 (TR, March 1986). 


Recreational Mathematics, P, L*. Penrose 
Tiles to Trapdoor Ciphers ...and the Return of 
Dr. Matrix. Martin Gardner. MAA, 1997, ix 
+ 319 pp, $27.95 (P). [ISBN 0-88385-521-6] 
This revised edition features a new bibliogra- 
phy, corrections to the text, and a postscript 
written by the author. (W.H. Freeman edition, 
TR, March 1989.) 


Education, P, L. Bold Ventures, Volume 1. Eds: 
Senta A. Raizen, Edward D. Britton. Kluwer 
Academic, 1997, xviii + 249 pp, $125. [ISBN 
0-7923-4231-3] Overview and synthesis of 
innovations in U.S. mathematics and science ed- 
ucation based on eight case studies (five in sci- 
ence, three in mathematics) carried out over the 
past five years. Covers the context and motiva- 
tion of reform; changing conceptions of science, 
mathematics, and instruction; changing roles 
of teachers; and “underplayed” issues (assessment, 
evaluation, equity, and diversity). LAS 


History, P. Helmut Wielandt: Mathematische 
Werke, Mathematical Works. Volume 2: Linear 
Algebra and Analysis. Eds: Bertram Huppert, 
Hans Schneider. Walter de Gruyter, 1996, xx + 
802 pp, DM 348. [ISBN 3-11-012453-X] 


Logic, P. Introduction to Mathematical Logic. 
Alonzo Church. Landmarks in Math. & 
Physics. Princeton Univ Pr, 1996, ix + 378 pp, 
$19.95 (P). [ISBN 0-691-02906-7] Republi- 
cation of the 1958 corrected second printing. 


Combinatorics, P. Probabilistic and Analyt- 
ical Aspects of the Umbral Calculus. A. Di 
Bucchianico. CWI Tract, V. 119. Centrum 
voor Wiskunde en Informatica, 1997, 148 pp, 
Dfl. 35 (P). [ISBN 90-6196-471-7] 


Discrete Mathematics, P. Operations Re- 
search and Discrete Analysis. Aleksei D. Ko- 
rshunov. Math. & Its Applic., V. 391. Kluwer 
Academic, 1997, vii + 331 pp, $175. [ISBN 0- 
7923-4334-4] Translations of papers from the 
second volume of the Russian-language journal 
Diskretnyĭ Analiz i Issledovanie Operatsiĭ. 


Number Theory, P. Primes of the Form 
x² + ny²: Fermat, Class Field Theory, and 
Complex Multiplication. David A. Cox. Pure 
& Appl. Math. Wiley, 1989, xi + 351 pp, 
$49.95 (P). [ISBN 0-471-19079-9] Paper- 
back republication (TR, December 1990). 


Number Theory, P. Continued Fractions. 
A. Ya. Khinchin. Dover, 1997, xi + 95 pp, 
$6.95 (P). [ISBN 0-486-69630-8] Republi- 
cation of the English translation of the third 
Russian edition (1961) published by the U. of 
Chicago in 1964. 


Number Theory, S(15-16), P, L*. The Book 
of Numbers. John H. Conway, Richard K. 
Guy. Springer-Verlag, 1996, ix + 310 pp, 
$29. [ISBN 0-387-97993-X] A marvelous 
compendium of fascinating facts pertaining to 
numbers, from the natural to the surreal, by two 
masters of the field. BC 


Number Theory, T**(16-17: 2). Elliptic 
Functions: A Constructive Approach. Peter L. 
Walker. Wiley, 1996, xv + 214 pp, $62.95. 
[ISBN 0-471-96531-6] A thoroughly modern 
introduction to the theory of elliptic functions 
suitable for use with (strong) undergraduates. 
Assumes only basic topology and a bit of com- 
plex analysis. Beginning with Eisenstein series 
(where else?), the author carefully constructs 
basic elliptic functions, theta functions, Jaco- 
bian functions, elliptic integrals, and modular 
functions. A perfect starter book for anyone 
wanting to work through Wiles’ proof of Fermat’s 
Last Theorem. MPR 


Linear Algebra, T*(14: 1), C. Interactive Lin- 
ear Algebra: A Laboratory Course Using Math- 
cad. Gerald J. Porter, David R. Hill. Springer- 
Verlag, 1996, $42.95 (P), with disks. [ISBN 
0-387-94608-X] Uses Mathcad (5.0 or 6.0) to 
create a discovery-based laboratory learning en- 
vironment. Covers the usual topics (eigenval- 
ues, linear transformations, and vector spaces, 
etc.). Exercises ask for written summaries of 
the mathematics. The text is a printout of the 
electronic book. JNC 


Group Theory, P. Group Theory in China. 
Eds: Zhe-xian Wan, Sheng-ming Shi. Math. & 
Its Applic. Kluwer Academic, 1996, 261 pp, 
$120. [ISBN 0-7923-3989-4] 15 essays by 
former students and colleagues of Hsio-Fu 
Tuan. 


Algebra, S(15), L. Applied Abstract Alge- 
bra. Ed: S.K. Jain. Centre for Professional 
Development in Higher Education (Univ. of 
Delhi, Delhi-110 007, India), 1996, 105 pp, 
(P). Notes from a workshop at the University 
of Delhi. Deals with standard applications such 
as block designs, Burnside’s and Pólya’s theorems, 
switching circuits, matroids, and cryptography. 
Includes references and occasional 
exercises. CEC 


Algebra, P. Ordered Algebraic Structures. 
Eds: W. Charles Holland, Jorge Martinez. 
Kluwer Academic, 1997, ix + 332 pp, $162. 
[ISBN 0-7923-4377-8] Proceedings of the 
June 1995 Curaçao conference. 


Algebra, P. Foundations of Lie Theory and 